{
"cells": [
{
"cell_type": "markdown",
"id": "7fb27b941602401d91542211134fc71a",
"metadata": {},
"source": [
"```{post} 2023-08-23\n",
":author: Saul\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "a611c138-1afd-4d6f-9578-4f358d2438eb",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"```{eval-rst}\n",
"`Presentation as HTML <../pldi_2023_presentation.slides.html>`_\n",
"```\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "68dd13a3-5864-4764-a528-1eedac698723",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"# `egglog` in Python\n",
"\n",
"In this talk I will try to cover how `egglog` in Python can:\n",
"\n",
"- Enable decentralized collaboration in the Python data science ecosystem.\n",
"- Provide a faithful authoring environment for `egglog`.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1f7b1c58-0f6f-4d43-a8b0-c04b5d2f5b08",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": [
"remove-input"
]
},
"outputs": [
{
"data": {
"image/svg+xml": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import SVG, display\n",
"\n",
"with open(\"big_graph.svg\") as f:\n",
" display(SVG(f.read()))"
]
},
{
"cell_type": "markdown",
"id": "d4ef7a6c-5fca-46ff-89d6-ab7e659f1c82",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"_Saul Shanabrook @ EGRAPHS Workshop - PLDI '23_\n"
]
},
{
"cell_type": "markdown",
"id": "72e644b5-c987-45de-87d7-ee9da885b85f",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"## Open Source Data Science Ecosystem in Python\n",
"\n",
"> The term “ecosystem” is often used to describe the modern open-source scientific software. In biology, the term “ecosystem” is defined as a biological community of interacting organisms and their physical environment. Modern open-source scientific software development occurs in a similarly interconnected and interoperable fashion.\n",
"\n",
"from [Jupyter Meets the Earth: Ecosystem](https://jupytearth.org/jupyter-resources/introduction/ecosystem.html)\n",
"\n",
"![](https://jupytearth.org/_images/python-stack.png)\n"
]
},
{
"cell_type": "markdown",
"id": "90785f75-e933-469f-b585-127f0e4fc584",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"### Aims\n",
"\n",
"- How can the tools we build foster greater **resiliancy, collaboration, and interdependence** in this ecosystem?
\n",
"- How can they help it **stay flexible enough to adapt to the changing computational landscape** to empower users and authors?\n",
"\n",
"### What role could `egglog` play?\n",
"\n",
"- Bring the PL community closer to this space, providing theoretical frameworks for thinking about composition and language.\n",
"- Constrained type system could support decentralized interopability and composition between data science libraries.\n"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "d06e53f0-6080-4e0e-b5d2-2e4140ebed72",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"outputs": [],
"source": [
"from __future__ import annotations\n",
"\n",
"from egglog import *"
]
},
{
"cell_type": "markdown",
"id": "0b49d9a5-ef01-4f2b-a6f7-1bc1e5087201",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"### Other Python EGraph Libraries\n",
"\n",
"- [`riswords/quiche`](https://github.com/riswords/quiche): Optimizing Python ASTs\n",
"- [`egraphs-good/snake-egg`](https://github.com/egraphs-good/snake-egg): Generic bindings for `egg`, can bring any Python objects as expressions.\n",
"- [EGraph library added to `ibis`](https://github.com/ibis-project/ibis/pull/5781): Conversions between dataframe syntax and SQL dialects\n"
]
},
{
"cell_type": "markdown",
"id": "c4c37446-3426-4b57-a402-2a234b54d930",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"TODO: Put this first, Say it's for library authors\n",
"\n",
"Semantics of python and egglog\n"
]
},
{
"cell_type": "markdown",
"id": "eadf7129-68f3-4fcc-87a4-57e064f66fc8",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Started with `snake-egg`\n",
"- Didn't want to re-invent the wheel, stay abreast of recent developments and research\n",
"- Second piece that interests me\n",
" - Unlike `egg` there are some builtin sorts, and can build user defined sorts on top of those\n",
" - No host language conditions or data structures\n",
" - Helps with optimization, more constrained\n",
" - -> De-centers algorithms based on value, move to based on type. Everything becomes an interface.\n",
" - Social dynamics, goal is ability to inovate and experiment, while still supporting existing use cases\n",
" - New dataframe library comes out, supporting custom hardware. How dow we use it without rewriting code?\n",
" - How do we have healthy ecosystem within these tools? Power\n",
" - If it's too hard, encourages centralized monopolistic actors to step in provide one stop shop solutions for users.\n",
" - Active problem in the community, with things like trying to standardize on interop.\n",
" - Before getting too abstract, let's go to an example\n"
]
},
{
"cell_type": "markdown",
"id": "a037bf31-2637-470a-ae26-b56538f689a6",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"### A story about Arrays\n"
]
},
{
"cell_type": "markdown",
"id": "082d9334-86e2-42a1-92f4-08563902b064",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- This is one path through a huge maze of use cases.\n",
"- Does not represent one killer example, but is an area I am familar with based on my previous work\n"
]
},
{
"cell_type": "markdown",
"id": "24955a1a-d3d6-44d8-a7ee-b5a87bd2a153",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"source": [
"#### 1. Someone makes an NDArray library...\n"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "be2f1cb2-81c1-485d-8eb9-06c4d8b3c6ba",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [],
"source": [
"ndarray_mod = Module()"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "b0d83999-9aac-4881-a81a-385541b621fb",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"outputs": [],
"source": [
"@ndarray_mod.class_\n",
"class Value(Expr):\n",
" def __init__(self, v: i64Like) -> None: ...\n",
"\n",
" def __mul__(self, other: Value) -> Value: ...\n",
"\n",
" def __add__(self, other: Value) -> Value: ...\n",
"\n",
"\n",
"i, j = vars_(\"i j\", i64)\n",
"ndarray_mod.register(\n",
" rewrite(Value(i) * Value(j)).to(Value(i * j)),\n",
" rewrite(Value(i) + Value(j)).to(Value(i + j)),\n",
")\n",
"\n",
"\n",
"@ndarray_mod.class_\n",
"class Values(Expr):\n",
" def __init__(self, v: Vec[Value]) -> None: ...\n",
"\n",
" def __getitem__(self, idx: Value) -> Value: ...\n",
"\n",
" def length(self) -> Value: ...\n",
"\n",
" def concat(self, other: Values) -> Values: ...\n",
"\n",
"\n",
"@ndarray_mod.register\n",
"def _values(vs: Vec[Value], other: Vec[Value]):\n",
" yield rewrite(Values(vs)[Value(i)]).to(vs[i])\n",
" yield rewrite(Values(vs).length()).to(Value(vs.length()))\n",
" yield rewrite(Values(vs).concat(Values(other))).to(Values(vs.append(other)))"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "0a9c6dfd-21b6-4e28-abe4-1bc8d8229a5a",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [],
"source": [
"@ndarray_mod.class_\n",
"class NDArray(Expr):\n",
" def __getitem__(self, idx: Values) -> Value: ...\n",
"\n",
" def shape(self) -> Values: ...\n",
"\n",
"\n",
"@ndarray_mod.function\n",
"def arange(n: Value) -> NDArray: ..."
]
},
{
"cell_type": "markdown",
"id": "36dd2603-c60e-42d0-a5d5-98043ed307a5",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Basic\n",
"- One function, range, get shape and index into array\n",
"- Very different from existing paradigms in Python... Inheritance, multi-dispatch, dunder protocols.\n",
" - Entirely open protocol.\n",
" - Anyone else could define ways to create arrays\n",
" - About mathematical definition really. This is from M\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "0f36b591-9a20-4405-80cb-3dbfea11b129",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"Restifo Mullin, Lenore Marie, \"A mathematics of arrays\" (1988). _Electrical Engineering and Computer Science - Dissertations_. 249.\n"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "3e020ae8-29a3-4e35-8510-aa2639c87701",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [],
"source": [
"@ndarray_mod.register\n",
"def _(n: Value, idx: Values, a: NDArray):\n",
" yield rewrite(arange(n).shape()).to(Values(Vec(n)))\n",
" yield rewrite(arange(n)[idx]).to(idx[Value(0)])"
]
},
{
"cell_type": "markdown",
"id": "81562baf-da04-4734-bc79-b0aedf11ab02",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Rules to compute shape and index into arange.\n"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "d46c1977-fa8c-4410-90c6-619ef3659a87",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"EGraph(_flatted_deps=[Module(_flatted_deps=[], _mod_decls=ModuleDeclarations(_decl=Declarations(_functions={'arange': FunctionDecl(arg_types=(TypeRefWithVars(name='Value', args=()),), return_type=TypeRefWithVars(name='NDArray', args=()), var_arg_type=None)}, _classes={'Value': ClassDecl(methods={'__mul__': FunctionDecl(arg_types=(TypeRefWithVars(name='Value', args=()), TypeRefWithVars(name='Value', args=())), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None), '__add__': FunctionDecl(arg_types=(TypeRefWithVars(name='Value', args=()), TypeRefWithVars(name='Value', args=())), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None)}, class_methods={'__init__': FunctionDecl(arg_types=(TypeRefWithVars(name='i64', args=()),), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None)}, class_variables={}, n_type_vars=0), 'Values': ClassDecl(methods={'__getitem__': FunctionDecl(arg_types=(TypeRefWithVars(name='Values', args=()), TypeRefWithVars(name='Value', args=())), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None), 'length': FunctionDecl(arg_types=(TypeRefWithVars(name='Values', args=()),), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None), 'concat': FunctionDecl(arg_types=(TypeRefWithVars(name='Values', args=()), TypeRefWithVars(name='Values', args=())), return_type=TypeRefWithVars(name='Values', args=()), var_arg_type=None)}, class_methods={'__init__': FunctionDecl(arg_types=(TypeRefWithVars(name='Vec', args=(TypeRefWithVars(name='Value', args=()),)),), return_type=TypeRefWithVars(name='Values', args=()), var_arg_type=None)}, class_variables={}, n_type_vars=0), 'NDArray': ClassDecl(methods={'__getitem__': FunctionDecl(arg_types=(TypeRefWithVars(name='NDArray', args=()), TypeRefWithVars(name='Values', args=())), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None), 'shape': FunctionDecl(arg_types=(TypeRefWithVars(name='NDArray', args=()),), return_type=TypeRefWithVars(name='Values', args=()), var_arg_type=None)}, class_methods={}, class_variables={}, n_type_vars=0)}, _constants={}, _egg_fn_to_callable_refs=defaultdict(, {'Value.__init__': {ClassMethodRef(class_name='Value', method_name='__init__')}, 'Value.__mul__': {MethodRef(class_name='Value', method_name='__mul__')}, 'Value.__add__': {MethodRef(class_name='Value', method_name='__add__')}, '!=': {MethodRef(class_name='Value', method_name='__ne__'), MethodRef(class_name='NDArray', method_name='__ne__'), MethodRef(class_name='Values', method_name='__ne__')}, 'Values.__init__': {ClassMethodRef(class_name='Values', method_name='__init__')}, 'Values.__getitem__': {MethodRef(class_name='Values', method_name='__getitem__')}, 'Values.length': {MethodRef(class_name='Values', method_name='length')}, 'Values.concat': {MethodRef(class_name='Values', method_name='concat')}, 'NDArray.__getitem__': {MethodRef(class_name='NDArray', method_name='__getitem__')}, 'NDArray.shape': {MethodRef(class_name='NDArray', method_name='shape')}, 'arange': {FunctionRef(name='arange')}}), _callable_ref_to_egg_fn={ClassMethodRef(class_name='Value', method_name='__init__'): 'Value.__init__', MethodRef(class_name='Value', method_name='__mul__'): 'Value.__mul__', MethodRef(class_name='Value', method_name='__add__'): 'Value.__add__', MethodRef(class_name='Value', method_name='__ne__'): '!=', ClassMethodRef(class_name='Values', method_name='__init__'): 'Values.__init__', MethodRef(class_name='Values', method_name='__getitem__'): 'Values.__getitem__', MethodRef(class_name='Values', method_name='length'): 'Values.length', MethodRef(class_name='Values', method_name='concat'): 'Values.concat', MethodRef(class_name='Values', method_name='__ne__'): '!=', MethodRef(class_name='NDArray', method_name='__getitem__'): 'NDArray.__getitem__', MethodRef(class_name='NDArray', method_name='shape'): 'NDArray.shape', MethodRef(class_name='NDArray', method_name='__ne__'): '!=', FunctionRef(name='arange'): 'arange'}, _egg_sort_to_type_ref={'Value': JustTypeRef(name='Value', args=()), 'Values': JustTypeRef(name='Values', args=()), 'Vec[Value]': JustTypeRef(name='Vec', args=(JustTypeRef(name='Value', args=()),)), 'NDArray': JustTypeRef(name='NDArray', args=())}, _type_ref_to_egg_sort={JustTypeRef(name='Value', args=()): 'Value', JustTypeRef(name='Values', args=()): 'Values', JustTypeRef(name='Vec', args=(JustTypeRef(name='Value', args=()),)): 'Vec[Value]', JustTypeRef(name='NDArray', args=()): 'NDArray'})))], _mod_decls=ModuleDeclarations(_decl=Declarations(_functions={}, _classes={}, _constants={}, _egg_fn_to_callable_refs=defaultdict(, {}), _callable_ref_to_egg_fn={}, _egg_sort_to_type_ref={}, _type_ref_to_egg_sort={})))"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"Values(Vec.empty().push(Value(10)))"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"egraph = EGraph([ndarray_mod])\n",
"ten = egraph.let(\"ten\", arange(Value(10)))\n",
"ten_shape = ten.shape()\n",
"egraph.register(ten_shape)\n",
"\n",
"egraph.run(20)\n",
"egraph.display()\n",
"egraph.extract(ten_shape)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "7a0d38d8-085a-4c61-ab23-06247c7171ef",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"EGraph(_flatted_deps=[Module(_flatted_deps=[], _mod_decls=ModuleDeclarations(_decl=Declarations(_functions={'arange': FunctionDecl(arg_types=(TypeRefWithVars(name='Value', args=()),), return_type=TypeRefWithVars(name='NDArray', args=()), var_arg_type=None)}, _classes={'Value': ClassDecl(methods={'__mul__': FunctionDecl(arg_types=(TypeRefWithVars(name='Value', args=()), TypeRefWithVars(name='Value', args=())), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None), '__add__': FunctionDecl(arg_types=(TypeRefWithVars(name='Value', args=()), TypeRefWithVars(name='Value', args=())), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None)}, class_methods={'__init__': FunctionDecl(arg_types=(TypeRefWithVars(name='i64', args=()),), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None)}, class_variables={}, n_type_vars=0), 'Values': ClassDecl(methods={'__getitem__': FunctionDecl(arg_types=(TypeRefWithVars(name='Values', args=()), TypeRefWithVars(name='Value', args=())), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None), 'length': FunctionDecl(arg_types=(TypeRefWithVars(name='Values', args=()),), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None), 'concat': FunctionDecl(arg_types=(TypeRefWithVars(name='Values', args=()), TypeRefWithVars(name='Values', args=())), return_type=TypeRefWithVars(name='Values', args=()), var_arg_type=None)}, class_methods={'__init__': FunctionDecl(arg_types=(TypeRefWithVars(name='Vec', args=(TypeRefWithVars(name='Value', args=()),)),), return_type=TypeRefWithVars(name='Values', args=()), var_arg_type=None)}, class_variables={}, n_type_vars=0), 'NDArray': ClassDecl(methods={'__getitem__': FunctionDecl(arg_types=(TypeRefWithVars(name='NDArray', args=()), TypeRefWithVars(name='Values', args=())), return_type=TypeRefWithVars(name='Value', args=()), var_arg_type=None), 'shape': FunctionDecl(arg_types=(TypeRefWithVars(name='NDArray', args=()),), return_type=TypeRefWithVars(name='Values', args=()), var_arg_type=None)}, class_methods={}, class_variables={}, n_type_vars=0)}, _constants={}, _egg_fn_to_callable_refs=defaultdict(, {'Value.__init__': {ClassMethodRef(class_name='Value', method_name='__init__')}, 'Value.__mul__': {MethodRef(class_name='Value', method_name='__mul__')}, 'Value.__add__': {MethodRef(class_name='Value', method_name='__add__')}, '!=': {MethodRef(class_name='Value', method_name='__ne__'), MethodRef(class_name='NDArray', method_name='__ne__'), MethodRef(class_name='Values', method_name='__ne__')}, 'Values.__init__': {ClassMethodRef(class_name='Values', method_name='__init__')}, 'Values.__getitem__': {MethodRef(class_name='Values', method_name='__getitem__')}, 'Values.length': {MethodRef(class_name='Values', method_name='length')}, 'Values.concat': {MethodRef(class_name='Values', method_name='concat')}, 'NDArray.__getitem__': {MethodRef(class_name='NDArray', method_name='__getitem__')}, 'NDArray.shape': {MethodRef(class_name='NDArray', method_name='shape')}, 'arange': {FunctionRef(name='arange')}}), _callable_ref_to_egg_fn={ClassMethodRef(class_name='Value', method_name='__init__'): 'Value.__init__', MethodRef(class_name='Value', method_name='__mul__'): 'Value.__mul__', MethodRef(class_name='Value', method_name='__add__'): 'Value.__add__', MethodRef(class_name='Value', method_name='__ne__'): '!=', ClassMethodRef(class_name='Values', method_name='__init__'): 'Values.__init__', MethodRef(class_name='Values', method_name='__getitem__'): 'Values.__getitem__', MethodRef(class_name='Values', method_name='length'): 'Values.length', MethodRef(class_name='Values', method_name='concat'): 'Values.concat', MethodRef(class_name='Values', method_name='__ne__'): '!=', MethodRef(class_name='NDArray', method_name='__getitem__'): 'NDArray.__getitem__', MethodRef(class_name='NDArray', method_name='shape'): 'NDArray.shape', MethodRef(class_name='NDArray', method_name='__ne__'): '!=', FunctionRef(name='arange'): 'arange'}, _egg_sort_to_type_ref={'Value': JustTypeRef(name='Value', args=()), 'Values': JustTypeRef(name='Values', args=()), 'Vec[Value]': JustTypeRef(name='Vec', args=(JustTypeRef(name='Value', args=()),)), 'NDArray': JustTypeRef(name='NDArray', args=())}, _type_ref_to_egg_sort={JustTypeRef(name='Value', args=()): 'Value', JustTypeRef(name='Values', args=()): 'Values', JustTypeRef(name='Vec', args=(JustTypeRef(name='Value', args=()),)): 'Vec[Value]', JustTypeRef(name='NDArray', args=()): 'NDArray'})))], _mod_decls=ModuleDeclarations(_decl=Declarations(_functions={}, _classes={}, _constants={}, _egg_fn_to_callable_refs=defaultdict(, {'vec-empty': set(), 'Value.__init__': set(), 'vec-push': set(), 'Values.__init__': set(), 'arange': set(), 'NDArray.__getitem__': set()}), _callable_ref_to_egg_fn={}, _egg_sort_to_type_ref={}, _type_ref_to_egg_sort={})))"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"Value(7)"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ten_indexed = ten[Values(Vec(Value(7)))]\n",
"egraph.register(ten_indexed)\n",
"\n",
"egraph.run(20)\n",
"\n",
"egraph.display()\n",
"egraph.extract(ten_indexed)"
]
},
{
"cell_type": "markdown",
"id": "956f439e-80e6-4ff3-adb2-52a784ee5902",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Any user can try it now\n"
]
},
{
"cell_type": "markdown",
"id": "07bece32-f436-4c6b-8d8a-21305673b376",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"#### 2. Someone else decides to implement a cross product library\n"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "0eecf615-36c1-44a6-b122-743f66bc997f",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [],
"source": [
"cross_mod = Module([ndarray_mod])\n",
"\n",
"\n",
"@cross_mod.function\n",
"def cross(l: NDArray, r: NDArray) -> NDArray: ...\n",
"\n",
"\n",
"@cross_mod.register\n",
"def _cross(l: NDArray, r: NDArray, idx: Values):\n",
" yield rewrite(cross(l, r).shape()).to(l.shape().concat(r.shape()))\n",
" yield rewrite(cross(l, r)[idx]).to(l[idx] * r[idx])"
]
},
{
"cell_type": "markdown",
"id": "d1a60f2d-5669-4c05-be64-68f525eae9e4",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Someone decides to add some functionality\n",
"- Multiplicative cross product\n",
"- Shape is concatation, index is product of each matrix at that index\n",
"- Mathematical definition\n"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "a7127ee6-2d17-4460-942a-b303277d8f81",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"Values(Vec.empty().push(Value(11)).push(Value(10)))"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"egraph = EGraph([cross_mod])\n",
"egraph.simplify(cross(arange(Value(10)), arange(Value(11))).shape(), 10)"
]
},
{
"cell_type": "markdown",
"id": "944b5b6c-7583-4c75-a740-0d75a835c522",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"#### 3. I write my wonderful data science application using it\n"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "14fde972-d358-4324-bc1e-35b3591184cc",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"Value(100)"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def my_special_app(x: Value) -> Value:\n",
" return cross(arange(x), arange(x))[Values(Vec(x))]\n",
"\n",
"\n",
"egraph = EGraph([cross_mod])\n",
"\n",
"egraph.simplify(my_special_app(Value(10)), 10)"
]
},
{
"cell_type": "markdown",
"id": "9f39ea1c-689d-4f5c-9fc3-bb8bd79e44f7",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Different person installs cross module\n",
"- Implements application using their complicated algorithm\n"
]
},
{
"cell_type": "markdown",
"id": "6967505c-48f0-45ad-837e-b6f2fdc99921",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
".... but its too slow...\n"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "12a435f1-5059-43d0-8a22-001bcf2ce19f",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [],
"source": [
"for i in range(100):\n",
" egraph.simplify(my_special_app(Value(i)), 10)"
]
},
{
"cell_type": "markdown",
"id": "06d04938-f7db-4094-a249-e778d199c3db",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Too slow in inner loop\n",
"- Is there a way we could optimize it\n"
]
},
{
"cell_type": "markdown",
"id": "8ba13f96-2333-43fd-a467-c18cf0daa9e4",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"#### 4. Someone else writes a library for delayed execution\n"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "17e9ad75-f829-4a0c-bf8f-e9047cd0e177",
"metadata": {
"editable": true,
"scrolled": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"Ellipsis"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"py_mod = Module([ndarray_mod])\n",
"\n",
"\n",
"@py_mod.function\n",
"def py_value(s: StringLike) -> Value: ..."
]
},
{
"cell_type": "markdown",
"id": "51cf4880-4576-40a7-8648-491218534d8f",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- While this is happening, someone else, based on the original module, wrote a different execution semantics\n",
"- Builds up expression string instead of trying to evaluate eagerly\n"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "5ce8398b-ffac-4bc9-bb53-52c039f35093",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"outputs": [],
"source": [
"@py_mod.register\n",
"def _py_value(l: String, r: String):\n",
" yield rewrite(py_value(l) + py_value(r)).to(py_value(join(l, \" + \", r)))\n",
" yield rewrite(py_value(l) * py_value(r)).to(py_value(join(l, \" * \", r)))\n",
"\n",
"\n",
"@py_mod.function\n",
"def py_values(s: StringLike) -> Values: ...\n",
"\n",
"\n",
"@py_mod.register\n",
"def _py_values(l: String, r: String):\n",
" yield rewrite(py_values(l)[py_value(r)]).to(py_value(join(l, \"[\", r, \"]\")))\n",
" yield rewrite(py_values(l).length()).to(py_value(join(\"len(\", l, \")\")))\n",
" yield rewrite(py_values(l).concat(py_values(r))).to(py_values(join(l, \" + \", r)))\n",
"\n",
"\n",
"@py_mod.function\n",
"def py_ndarray(s: StringLike) -> NDArray: ...\n",
"\n",
"\n",
"@py_mod.register\n",
"def _py_ndarray(l: String, r: String):\n",
" yield rewrite(py_ndarray(l)[py_values(r)]).to(py_value(join(l, \"[\", r, \"]\")))\n",
" yield rewrite(py_ndarray(l).shape()).to(py_values(join(l, \".shape\")))\n",
" yield rewrite(arange(py_value(l))).to(py_ndarray(join(\"np.arange(\", l, \")\")))"
]
},
{
"cell_type": "markdown",
"id": "85e7ceba-65f2-4766-8673-2d4dfa5bab96",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"#### 5. I can use it jit compile my application!\n"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "96bb493e-74a1-41a1-911e-a67ac31d92e5",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"py_value(\"x * x\")"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"egraph = EGraph([cross_mod, py_mod])\n",
"egraph.simplify(my_special_app(py_value(\"x\")), 10)"
]
},
{
"cell_type": "markdown",
"id": "5e9b8c74-42df-409c-ac74-66c605e46035",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- I pull in third party library\n",
"- Add it to my e-graph\n",
"- Now I can compile lazily\n",
"- py_mod never needed to know about cross product, works with it\n"
]
},
{
"cell_type": "markdown",
"id": "b5abc145-33bf-4093-ae7d-5dd4a60bc6ae",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"source": [
"... and add support for jit compilation for the other library I am using, without changing either library:\n"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "14cec5ba-b96e-46ef-853a-ddbe4ccfb37f",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"outputs": [],
"source": [
"@egraph.register\n",
"def _(l: String, r: String):\n",
" yield rewrite(cross(py_ndarray(l), py_ndarray(r))).to(py_ndarray(join(\"np.multiply.outer(\", l, \", \", r, \")\")))"
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "10190128-8d45-4722-8977-4855a0e44be9",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "skip"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'big_graph.svg'"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"egraph.run(20)\n",
"egraph.graphviz().render(outfile=\"big_graph.svg\", format=\"svg\")"
]
},
{
"cell_type": "markdown",
"id": "193b6cf7-01b1-4446-8664-04d622fb0070",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"#### Takeaways...\n",
"\n",
"_...from this totally realistic example._\n",
"\n",
"- Declerative nature of `egglog` could facilitate decantralized library collaboration and experimentation.\n",
" - Focus on types over values for library authors encourages interoperability.\n",
"- Pushing power down, empowering users and library authors\n",
"- Could allow greater collaboration between PL community and data science library community in Python\n"
]
},
{
"cell_type": "markdown",
"id": "2daa57bf-0078-4edf-a4bf-a6586bd56968",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"## How does it work?\n"
]
},
{
"cell_type": "markdown",
"id": "54f61359-111b-45f8-b7b2-e54d21aa2ccd",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Will show how some examples translate\n"
]
},
{
"cell_type": "markdown",
"id": "673a0c32-2b21-431e-b5fc-6d8b29abf6a5",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"source": [
"### Sorts, expressions, and functions\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "33d34134-78c9-472b-b364-479c3aca3b23",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%egglog graph\n",
"(datatype Math\n",
" (Num i64)\n",
" (Var String)\n",
" (Add Math Math)\n",
" (Mul Math Math))\n",
"\n",
"(define expr1 (Mul (Num 2) (Add (Var \"x\") (Num 3))))\n",
"(define expr2 (Add (Num 6) (Mul (Num 2) (Var \"x\"))))"
]
},
{
"cell_type": "markdown",
"id": "1c0f4ee3-f5bd-49e7-b2a6-81088809eb08",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- User defined sorts\n",
"- Expressions\n",
" - expr1 and expr2 in their own e-classes, we haven't ran any rules\n",
"- `%%egglog` magic, Writing egglog in Notebook, graphs, output inline.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4c774ee4-722c-4a3a-b61b-37733b435948",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"EGraph(_flatted_deps=[], _mod_decls=ModuleDeclarations(_decl=Declarations(_functions={}, _classes={'Num': ClassDecl(methods={'__add__': FunctionDecl(arg_types=(TypeRefWithVars(name='Num', args=()), TypeRefWithVars(name='Num', args=())), return_type=TypeRefWithVars(name='Num', args=()), var_arg_type=None), '__mul__': FunctionDecl(arg_types=(TypeRefWithVars(name='Num', args=()), TypeRefWithVars(name='Num', args=())), return_type=TypeRefWithVars(name='Num', args=()), var_arg_type=None)}, class_methods={'var': FunctionDecl(arg_types=(TypeRefWithVars(name='String', args=()),), return_type=TypeRefWithVars(name='Num', args=()), var_arg_type=None), '__init__': FunctionDecl(arg_types=(TypeRefWithVars(name='i64', args=()),), return_type=TypeRefWithVars(name='Num', args=()), var_arg_type=None)}, class_variables={}, n_type_vars=0)}, _constants={}, _egg_fn_to_callable_refs=defaultdict(, {'Num.var': {ClassMethodRef(class_name='Num', method_name='var')}, 'Num.__init__': {ClassMethodRef(class_name='Num', method_name='__init__')}, 'Num.__add__': {MethodRef(class_name='Num', method_name='__add__')}, 'Num.__mul__': {MethodRef(class_name='Num', method_name='__mul__')}, '!=': {MethodRef(class_name='Num', method_name='__ne__')}}), _callable_ref_to_egg_fn={ClassMethodRef(class_name='Num', method_name='var'): 'Num.var', ClassMethodRef(class_name='Num', method_name='__init__'): 'Num.__init__', MethodRef(class_name='Num', method_name='__add__'): 'Num.__add__', MethodRef(class_name='Num', method_name='__mul__'): 'Num.__mul__', MethodRef(class_name='Num', method_name='__ne__'): '!='}, _egg_sort_to_type_ref={'Num': JustTypeRef(name='Num', args=())}, _type_ref_to_egg_sort={JustTypeRef(name='Num', args=()): 'Num'})))"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"egraph = EGraph()\n",
"\n",
"\n",
"@egraph.class_\n",
"class Num(Expr):\n",
" @classmethod\n",
" def var(cls, name: StringLike) -> Num: ...\n",
"\n",
" def __init__(self, value: i64Like) -> None: ...\n",
"\n",
" def __add__(self, other: Num) -> Num: ...\n",
"\n",
" def __mul__(self, other: Num) -> Num: ...\n",
"\n",
"\n",
"expr1 = egraph.let(\"expr1\", Num(2) * (Num.var(\"x\") + Num(3)))\n",
"expr2 = egraph.let(\"expr2\", Num(6) + Num(2) * Num.var(\"x\"))\n",
"egraph"
]
},
{
"cell_type": "markdown",
"id": "1e32a7c4-fbb7-4dd2-8af1-c7fd22a3ca86",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Re-use existing Python class and functions\n",
" - Humans and computers to understand the typing semantics\n",
" - Humans read `__init__` and `__add__`.\n",
" - Static type checkers. `Num(\"String\")` it won't work.\n",
" - Static type checking drives much of the API design of the library\n",
"- Operator overloading support infix operators\n",
"- Names generated based on classes\n",
" - Same operator on different types compile to different function with different signature\n"
]
},
{
"cell_type": "markdown",
"id": "6efa3a03-cce0-4075-a767-84da0a7aec86",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"### Rewrite rules and checks\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "99cb645c-f006-4e9e-bc39-762be3fa5d61",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%egglog graph continue\n",
"(rewrite (Add a b)\n",
" (Add b a))\n",
"(rewrite (Mul a (Add b c))\n",
" (Add (Mul a b) (Mul a c)))\n",
"(rewrite (Add (Num a) (Num b))\n",
" (Num (+ a b)))\n",
"(rewrite (Mul (Num a) (Num b))\n",
" (Num (* a b)))\n",
"\n",
"(run 10)\n",
"(check (= expr1 expr2))"
]
},
{
"cell_type": "markdown",
"id": "0f5ef7d7-e043-40ee-a334-2e9b34a7b33f",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- See equivalent, in same e-class now\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "a7f568af-5655-4926-bb93-5c2415f267cb",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/html": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"EGraph(_flatted_deps=[], _mod_decls=ModuleDeclarations(_decl=Declarations(_functions={}, _classes={'Num': ClassDecl(methods={'__add__': FunctionDecl(arg_types=(TypeRefWithVars(name='Num', args=()), TypeRefWithVars(name='Num', args=())), return_type=TypeRefWithVars(name='Num', args=()), var_arg_type=None), '__mul__': FunctionDecl(arg_types=(TypeRefWithVars(name='Num', args=()), TypeRefWithVars(name='Num', args=())), return_type=TypeRefWithVars(name='Num', args=()), var_arg_type=None)}, class_methods={'var': FunctionDecl(arg_types=(TypeRefWithVars(name='String', args=()),), return_type=TypeRefWithVars(name='Num', args=()), var_arg_type=None), '__init__': FunctionDecl(arg_types=(TypeRefWithVars(name='i64', args=()),), return_type=TypeRefWithVars(name='Num', args=()), var_arg_type=None)}, class_variables={}, n_type_vars=0)}, _constants={}, _egg_fn_to_callable_refs=defaultdict(, {'Num.var': {ClassMethodRef(class_name='Num', method_name='var')}, 'Num.__init__': {ClassMethodRef(class_name='Num', method_name='__init__')}, 'Num.__add__': {MethodRef(class_name='Num', method_name='__add__')}, 'Num.__mul__': {MethodRef(class_name='Num', method_name='__mul__')}, '!=': {MethodRef(class_name='Num', method_name='__ne__')}}), _callable_ref_to_egg_fn={ClassMethodRef(class_name='Num', method_name='var'): 'Num.var', ClassMethodRef(class_name='Num', method_name='__init__'): 'Num.__init__', MethodRef(class_name='Num', method_name='__add__'): 'Num.__add__', MethodRef(class_name='Num', method_name='__mul__'): 'Num.__mul__', MethodRef(class_name='Num', method_name='__ne__'): '!='}, _egg_sort_to_type_ref={'Num': JustTypeRef(name='Num', args=())}, _type_ref_to_egg_sort={JustTypeRef(name='Num', args=()): 'Num'})))"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@egraph.register\n",
"def _(a: Num, b: Num, c: Num, i: i64, j: i64):\n",
" yield rewrite(a + b).to(b + a)\n",
" yield rewrite(a * (b + c)).to((a * b) + (a * c))\n",
" yield rewrite(Num(i) + Num(j)).to(Num(i + j))\n",
" yield rewrite(Num(i) * Num(j)).to(Num(i * j))\n",
"\n",
"\n",
"egraph.run(10)\n",
"egraph.check(eq(expr1).to(expr2))\n",
"egraph"
]
},
{
"cell_type": "markdown",
"id": "208b3db5-e2d7-48f8-afaf-e637c132691e",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Similar in Python, rewrite rules, run, check\n",
"- Notice that all vars need types, unlike inferred in egglog\n",
" - Both for static type checkers to verify\n",
" - And for runtime to know what methods\n"
]
},
{
"cell_type": "markdown",
"id": "cd095e65-074d-44d1-b180-060aa34a92d7",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"### Extracting lowest cost expression\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "33644682-1e44-4d30-9ed8-9126ff00582b",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Extracted with cost 8: (Add (Mul (Num 2) (Var \"x\")) (Num 6))\n"
]
}
],
"source": [
"%%egglog continue output\n",
"(extract expr1)"
]
},
{
"cell_type": "markdown",
"id": "b8932052-014c-4e93-b3f3-b8e112b77811",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Extract lowest cost expr\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "63308ba8-f31a-4809-bf4a-c6816cbcdc00",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"(Num(2) * Num.var(\"x\")) + Num(6)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"egraph.extract(expr1)"
]
},
{
"cell_type": "markdown",
"id": "386bb340-06a1-4c86-88ba-33201d4214ca",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- get back expr object\n",
"- Str representation is Python syntax\n"
]
},
{
"cell_type": "markdown",
"id": "f5e0d729-028a-4af4-947f-62a2e109d28a",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"### Multipart Rules\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "0bea7d72-bf5c-47c4-95ac-e2926ad33cf3",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%egglog graph\n",
"(function fib (i64) i64)\n",
"\n",
"(set (fib 0) 0)\n",
"(set (fib 1) 1)\n",
"(rule ((= f0 (fib x))\n",
" (= f1 (fib (+ x 1))))\n",
" ((set (fib (+ x 2)) (+ f0 f1))))\n",
"\n",
"(run 7)\n",
"(check (= (fib 7) 13))"
]
},
{
"cell_type": "markdown",
"id": "2dc964ef-f5ee-4b7c-80a5-5a11b848ece7",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Rule that depend on facts and execute actions\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "7f4bd927-ccd6-4240-aec0-123d93a782c5",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [],
"source": [
"fib_egraph = EGraph()\n",
"\n",
"\n",
"@fib_egraph.function\n",
"def fib(x: i64Like) -> i64: ...\n",
"\n",
"\n",
"@fib_egraph.register\n",
"def _(f0: i64, f1: i64, x: i64):\n",
" yield set_(fib(0)).to(i64(1))\n",
" yield set_(fib(1)).to(i64(1))\n",
" yield rule(\n",
" eq(f0).to(fib(x)),\n",
" eq(f1).to(fib(x + 1)),\n",
" ).then(set_(fib(x + 2)).to(f0 + f1))\n",
"\n",
"\n",
"fib_egraph.run(7)\n",
"fib_egraph.check(eq(fib(7)).to(i64(21)))"
]
},
{
"cell_type": "markdown",
"id": "df10f994-3e91-4663-807a-877f3dd50e04",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- `set_` and and `eq` both type safe. Required builder syntax\n"
]
},
{
"cell_type": "markdown",
"id": "5b4d8cba-3e99-4e61-afe0-c18724c0d358",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"### Include & Modules\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "68179969-f9e0-43ef-851f-bd2295bd5a21",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting path.egg\n"
]
}
],
"source": [
"%%writefile path.egg\n",
"(relation path (i64 i64))\n",
"(relation edge (i64 i64))\n",
"\n",
"(rule ((edge x y))\n",
" ((path x y)))\n",
"\n",
"(rule ((path x y) (edge y z))\n",
" ((path x z)))"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "39273d7a-926c-4022-8514-8f5b297cfdfb",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [],
"source": [
"%%egglog\n",
"(include \"path.egg\")\n",
"(edge 1 2)\n",
"(edge 2 3)\n",
"(edge 3 4)\n",
"(run 3)\n",
"(check (path 1 3))"
]
},
{
"cell_type": "markdown",
"id": "ac6aec27-654d-4cf5-bb4e-3341074ffa64",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Include another file for re-useability\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "ad42a38e-ad5d-476b-8ecb-a04a56a618d1",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [],
"source": [
"mod = Module()\n",
"path = mod.relation(\"path\", i64, i64)\n",
"edge = mod.relation(\"edge\", i64, i64)\n",
"\n",
"\n",
"@mod.register\n",
"def _(x: i64, y: i64, z: i64):\n",
" yield rule(edge(x, y)).then(path(x, y))\n",
" yield rule(path(x, y), edge(y, z)).then(path(x, z))"
]
},
{
"cell_type": "markdown",
"id": "93b04224-cf27-41aa-b3e8-4fb5f5bb1508",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Modules same in Python\n",
"- Supports defining rules, etc, but doesn't actually run them, just builds up commands\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "8f8932a2-f6ea-4a7c-a4f3-41e4455aa33e",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "fragment"
},
"tags": []
},
"outputs": [],
"source": [
"egraph = EGraph([mod])\n",
"egraph.register(edge(1, 2), edge(2, 3), edge(3, 4))\n",
"egraph.run(3)\n",
"egraph.check(path(1, 3))"
]
},
{
"cell_type": "markdown",
"id": "99f0f9f2-297d-45ac-bc84-053f591d894d",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "notes"
},
"tags": []
},
"source": [
"- Then when we depend on them, it will run those commands first.\n",
"- Allows distribution of code and others to re-use it, using existing Python import mechanisms.\n"
]
},
{
"cell_type": "markdown",
"id": "52e042f3-adc3-466c-bea0-0a00cd037377",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"## Possible next steps?\n",
"\n",
"- Try getting toehold in existing library (like Ibis) to see if constrained egglog approach can still be useful.\n",
" - Add support for Python objects as builtin sort.\n",
"- Upstream egglog improvements which could help with reuse\n",
" - First class functions (would help with implementing things like reductions, mapping)\n",
" - User defined generic sorts (i.e. an array type agnostic to inner values)\n"
]
},
{
"cell_type": "markdown",
"id": "2b67e0b8-3c8d-45a5-a041-4293da7aed9e",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"source": [
"## Thank you!\n",
"\n",
"```bash\n",
"pip install egglog\n",
"```\n",
"\n",
"```python\n",
"from egglog import *\n",
"\n",
"egraph = EGraph()\n",
"...\n",
"```\n",
"\n",
"_Come say hello at [github.com/egraphs-good/egglog-python](https://github.com/egraphs-good/egglog-python)!_\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
},
"mystnb": {
"execution_mode": "off"
}
},
"nbformat": 4,
"nbformat_minor": 5
}