mirror of https://github.com/dhil/phd-dissertation synced 2026-04-28 15:06:29 +01:00

Compare commits


4 Commits

Author SHA1 Message Date
f6613484ca skeleton code 2021-02-04 00:25:50 +00:00
756ce77e5a Edits 2021-02-03 22:29:38 +00:00
33781a94e2 Start parameterised handlers. 2021-02-03 22:26:47 +00:00
9fc66e6b62 Frequency 2021-02-03 18:04:15 +00:00
3 changed files with 211 additions and 56 deletions


@@ -41,6 +41,7 @@
\newcommand{\EffCalc}{\ensuremath{\lambda_{\mathsf{eff}}}\xspace}
\newcommand{\UCalc}{\ensuremath{\lambda_{\mathsf{u}}}\xspace}
\newcommand{\param}{\ensuremath{\ddagger}}
\newcommand{\HPCalc}{\ensuremath{\lambda_{\mathsf{h^\param}}}\xspace}
%%
%% Calculi terms and types type-setting.
@@ -448,6 +449,12 @@
\newcommand{\paste}{\dec{paste}}
\newcommand{\sed}{\dec{sed}}
\newcommand{\printTable}{\dec{renderTable}}
\newcommand{\timesharee}{\dec{timeshare2}}
\newcommand{\Co}{\dec{Co}}
\newcommand{\UFork}{\dec{UFork}}
\newcommand{\ufork}{\dec{ufork}}
\newcommand{\Wait}{\dec{Wait}}
\newcommand{\scheduler}{\dec{scheduler}}
%%
%% Some control operators


@@ -1712,6 +1712,23 @@
OPTaddress = {Boston, MA, USA}
}
@book{PizziniBMG20,
author = {Ken Pizzini and Paolo Bonzini and Jim Meyering and Assaf Gordon},
@Comment = {David MacKenzie
@Comment and Jim Meyering
@Comment and Ross Paterson
@Comment and François Pinard
@Comment and Karl Berry
@Comment and Brian Youmans
@Comment and Richard Stallman},
title = {{GNU} sed, a stream editor},
note = {For version 4.8},
month = jan,
year = 2020,
publisher = {Free Software Foundation},
OPTaddress = {Boston, MA, USA}
}
# Expressiveness
@inproceedings{Felleisen90,
author = {Matthias Felleisen},


@@ -7151,7 +7151,7 @@ invoked with the resumption of the producer along with a thunk that
applies the consumer's resumption to the yielded value.
%
For aesthetics, we define a right-associative infix alias for pipe:
$p \mid c \defas \lambda\Unit.\Pipe\,\Record{p;c}$.
Let us put the pipe operator to use by performing a simple string
frequency analysis on a file. We will implement the analysis as a
@@ -7245,6 +7245,16 @@ the character was nil in which case the process
terminates. Alternatively, if the character was a newline the function
applies itself recursively with $n$ decremented by one. Otherwise it
applies itself recursively with the original $n$.
The $\head$ filter does not transform the shape of its data stream: it
both awaits and yields characters. However, the awaited and yielded
types need not coincide within a single filter, meaning we can
implement a filter that transforms the shape of the data. Let us
implement a variation of the GNU Coreutils utility \emph{paste}, which
merges lines of files~\cite[Section~8.2]{MacKenzieMPPBYS20}. Our
implementation will join characters in its input stream into strings
separated by spaces and newlines such that the string frequency
analysis utility need not operate on the low level of characters.
%
\[
\bl
@@ -7264,6 +7274,33 @@ applies itself recursively with the original $n$.
\el
\]
%
The heavy lifting is delegated to the recursive function $paste'$,
which accepts two parameters: 1) the next character in the input
stream, and 2) a string buffer for building the output string. The
function is initially applied to the first character from the stream
(returned by the invocation of $\Await$) and the empty string
buffer. The function $paste'$ is defined by pattern matching on the
character parameter. The first three definitions handle the special
cases when the received character is nil, newline, and space,
respectively. If the character is nil, then the function yields the
contents of the string buffer followed by a string containing
only the nil character. If the character is a newline, then the
function yields the string buffer followed by a string containing the
newline character. Afterwards the function applies itself recursively
with the next character from the input stream and an empty string
buffer. The case when the character is a space is similar to the
previous case except that it does not yield a newline string. The
final definition simply concatenates the character onto the string
buffer and recurses.
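For comparison, the same reshaping can be sketched with ordinary
Python generators, reading $\Await$ as pulling the next element from
an input iterator and $\Yield$ as the generator's yield; the
\texttt{None} sentinel standing in for the nil character is an
assumption of the sketch, not part of the calculus.

```python
def paste(chars):
    """Join a character stream into whitespace-delimited strings.

    Pulling from `chars` plays the role of Await; `yield` plays the
    role of Yield. A None sentinel stands in for the nil character.
    """
    buf = ""
    for c in chars:
        if c is None:      # nil: flush the buffer, forward the sentinel
            yield buf
            yield None
            return
        elif c == "\n":    # newline: flush buffer, then the newline itself
            yield buf
            yield "\n"
            buf = ""
        elif c == " ":     # space: flush buffer, drop the delimiter
            yield buf
            buf = ""
        else:              # ordinary character: extend the buffer
            buf += c
```

For instance, `list(paste(iter(list("to be,\n") + [None])))` evaluates
to `["to", "be,", "\n", "", None]`.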
Another useful filter is the GNU stream editor,
\emph{sed}~\cite{PizziniBMG20}. It is an advanced text processing
tool, whose complete functionality we will not attempt to replicate
here. We will implement just the ability to replace one string with
another. This will be useful for normalising the input stream to the
frequency analysis utility, e.g.\ decapitalising words, removing
unwanted characters, etc.
%
\[
\bl
\sed : \Record{\String;\String} \to \UnitType \eff \{\Await : \UnitType \opto \String;\Yield : \String \opto \UnitType\}\\
@@ -7276,6 +7313,16 @@ applies itself recursively with the original $n$.
\el
\]
%
The function $\sed$ takes two string arguments. The first argument is
the string to be replaced in the input stream, and the second argument
is the replacement. The function first awaits the next string from the
input stream; then it checks whether the received string is the same
as $target$, in which case it yields the replacement $str'$ and
recurses. Otherwise it yields the received string and recurses.
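A generator rendition of this filter is short; as before, the
\texttt{None} end-of-stream sentinel is an assumption of the sketch.

```python
def sed(target, replacement, strings):
    """Replace every occurrence of `target` in a string stream.

    Await the next string, then yield either the replacement or the
    string unchanged; the None sentinel never matches `target`, so the
    end of the stream is forwarded as-is.
    """
    for s in strings:
        yield replacement if s == target else s
```

For instance, `list(sed("To", "to", iter(["To", "be", "To", None])))`
evaluates to `["to", "be", "to", None]`.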
Now let us implement the string frequency analysis utility. It works
on strings, counting the occurrences of each string in the input
stream.
%
\[
\bl
\freq : \UnitType \to \UnitType \eff \{\Await : \UnitType \opto \String;\Yield : \List\,\Record{\String;\Int} \opto \UnitType\}\\
@@ -7300,20 +7347,50 @@ applies itself recursively with the original $n$.
\el
\]
%
The auxiliary recursive function $freq'$ implements the analysis. It
takes two arguments: 1) the next string from the input stream, and 2)
a table to keep track of how many times each string has occurred. The
table is implemented as an association list indexed by strings. The
function is initially applied to the first string from the input
stream and the empty list. The function is defined by pattern matching
on the string argument. The first definition handles the case when the
input stream has been exhausted in which case the function yields the
table. The other case is responsible for updating the entry associated
with the string $str$ in the table $tbl$. There are two subcases to
consider: 1) the string has not been seen before, thus a new entry
will have to be created; or 2) the string already has an entry in the
table, thus the entry will have to be updated. We handle both cases
simultaneously by making use of the handler $\faild$, where the
default value accounts for the first subcase, and the computation
accounts for the second. The computation attempts to look up the entry
associated with $str$ in $tbl$; if the lookup fails then $\faild$
returns the default value, which is the original table augmented with
an entry for $str$. If an entry already exists it gets incremented by
one. The resulting table $tbl'$ is supplied to a recursive application
of $freq'$.
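The same logic can be sketched as a generator over an association
list; the explicit lookup-with-default below is a stand-in for the use
of $\faild$, not a transcription of it.

```python
def freq(strings):
    """Count string occurrences, yielding an association list at nil.

    The lookup-with-default mirrors the faild handler: a failed lookup
    produces the default (a fresh entry for the string), a successful
    one increments the existing entry.
    """
    tbl = []                          # association list, insertion-ordered
    for s in strings:
        if s is None:                 # stream exhausted: yield the table
            yield tbl
            return
        for i, (k, n) in enumerate(tbl):
            if k == s:
                tbl[i] = (k, n + 1)   # existing entry: increment it
                break
        else:
            tbl.append((s, 1))        # unseen string: create an entry
```

For instance, `next(freq(iter(["to", "be", "to", None])))` evaluates
to `[("to", 2), ("be", 1)]`.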
We need one more building block to complete the pipeline. The utility
$\freq$ returns a value of type $\List~\Record{\String;\Int}$; we need
a utility to render this value as a string in order to write it to a
file.
%
\[
\bl
\printTable : \UnitType \to \UnitType \eff \{\Await : \UnitType \opto \List\,\Record{\String;\Int}\}\\
\printTable\,\Unit \defas
\map\,\Record{\lambda\Record{s;i}.s \concat \strlit{:} \concat \intToString~i \concat \strlit{;};\Do\;\Await~\Unit}
\el
\]
%
The function performs one invocation of $\Await$ to receive the table,
and then performs a $\map$ over the table. The function argument to
$\map$ builds a string from the provided string-integer pair.
%
Here we make use of an auxiliary function,
$\intToString : \Int \to \String$, that turns an integer into a
string. The definition of this function is omitted here for brevity.
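A generator sketch of this rendering step, with Python's built-in
string conversion standing in for $\intToString$:

```python
def render_table(tables):
    """Await one table, then render each (string, count) pair as "s:i;"."""
    tbl = next(tables)   # a single Await receives the whole table
    # str() plays the role of intToString here.
    yield "".join(s + ":" + str(i) + ";" for s, i in tbl)
```

For instance, `list(render_table(iter([[("to", 2), ("be", 1)]])))`
evaluates to `["to:2;be:1;"]`.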
%
%
% \[
% \bl
% \wc : \UnitType \to \UnitType \eff \{\Await : \UnitType \opto \Char;\Yield : \Int \opto \UnitType\}\\
@@ -7333,6 +7410,11 @@ applies itself recursively with the original $n$.
% \el
% \]
%
We now have all the building blocks to construct a pipeline for
performing string frequency analysis on a file. The following performs
the analysis on the first two lines of the Hamlet quote.
%
\[
\ba{@{~}l@{~}l}
&\bl
@@ -7343,13 +7425,14 @@ applies itself recursively with the original $n$.
\qquad\qquad\status\,(\lambda\Unit.
\ba[t]{@{}l}
\quoteHamlet~\redirect~\strlit{hamlet};\\
\Let\;p \revto
\bl
~~(\lambda\Unit.\cat~\strlit{hamlet}) \mid (\lambda\Unit.\head~2) \mid \paste\\
\mid (\lambda\Unit.\sed\,\Record{\strlit{be,};\strlit{be}}) \mid (\lambda\Unit.\sed\,\Record{\strlit{To};\strlit{to}})\\
\mid (\lambda\Unit.\sed\,\Record{\strlit{question:};\strlit{question}})\\
\mid \freq \mid \printTable
\el\\
\In\;(\lambda\Unit.\echo~(p\,\Unit))~\redirect~\strlit{analysis})})))}
\ea
\el \smallskip\\
\reducesto^+&
@@ -7368,8 +7451,8 @@ applies itself recursively with the original $n$.
\ba[t]{@{}l}
\Record{2;
\ba[t]{@{}l@{}l}
\texttt{"}&\texttt{to:2;be:2;or:1;not:1;\nl:2;that:1;is:1}\\
&\texttt{the:1;question:1;"}},
\ea\\
\Record{1;
\ba[t]{@{}l@{}l}
@@ -7385,35 +7468,48 @@ applies itself recursively with the original $n$.
\ea
\]
%
%
% \[
% \bl
% \grep : \String \to \UnitType \eff \{\Await : \UnitType \opto \Char;\Yield : \Char \opto \UnitType\}\\
% \grep~str \defas
% \bl
% \match\,\Record{\Do\;\Await~\Unit; str; \nil}
% \el
% \el
% \]
%
The pipeline gets bound to the variable $p$. The pipeline starts with
a call to $\cat$, which streams the contents of the file
$\strlit{hamlet}$ to the process $\head$ applied to $2$, meaning it
forwards only the first two lines of the file to its successor. The
third process $\paste$ receives the first two lines one character at a
time and joins the characters into strings delimited by
whitespace. The next three instances of $\sed$ perform some string
normalisation. The first instance removes the trailing comma from the
string $\strlit{be,}$; the second normalises the capitalisation of the
word ``to''; and the third removes the trailing colon from the string
$\strlit{question:}$. The seventh process performs the frequency
analysis and outputs a table, which is rendered as a string by the
eighth process. The output of the pipeline is supplied to the $\echo$
utility, whose output is redirected to a file named
$\strlit{analysis}$. The contents of the file reside in location $2$
in the data region. Here we can see that the analysis has found that
the words ``to'', ``be'', and the newline character ``$\nl$'' appear
twice each, whilst the other words appear once each.
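The whole pipeline can be replayed as a chain of Python generators;
$\cat$ and $\head$ are modelled over an in-memory string (there is no
file system in this sketch), and empty buffers are skipped for tidier
output.

```python
def cat(text):
    """Stream the characters of `text`, terminated by a None sentinel."""
    yield from text
    yield None

def head(n, chars):
    """Forward characters until n newlines (or the sentinel) have passed."""
    for c in chars:
        if c is None or n == 0:
            yield None
            return
        if c == "\n":
            n -= 1
        yield c

def paste(chars):
    """Join characters into whitespace-delimited strings."""
    buf = ""
    for c in chars:
        if c is None:
            if buf:
                yield buf
            yield None
            return
        if c in (" ", "\n"):
            if buf:
                yield buf
            if c == "\n":
                yield "\n"
            buf = ""
        else:
            buf += c

def sed(target, repl, strings):
    """Replace target strings; forward everything else unchanged."""
    for s in strings:
        yield repl if s == target else s

def freq(strings):
    """Count occurrences; yield the association list at end of stream."""
    tbl = []
    for s in strings:
        if s is None:
            yield tbl
            return
        for i, (k, n) in enumerate(tbl):
            if k == s:
                tbl[i] = (k, n + 1)
                break
        else:
            tbl.append((s, 1))

def render(tables):
    """Render the table as "s:i;" entries."""
    yield "".join(s + ":" + str(i) + ";" for s, i in next(tables))

hamlet = "To be, or not to be,\nthat is the question:\nWhether 'tis nobler\n"
p = render(freq(sed("question:", "question",
                sed("To", "to",
                sed("be,", "be",
                paste(head(2, cat(hamlet))))))))
```

Running `next(p)` produces
`"to:2;be:2;or:1;not:1;\n:2;that:1;is:1;the:1;question:1;"`, matching
the rendered table above.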
\section{Parameterised handlers}
\label{sec:unary-parameterised-handlers}
Parameterised handlers are a useful idiom for handling stateful
computations. They are a variation of ordinary deep handlers with an
embedded functional state cell. This state cell is only accessible
locally within the handler. The use of state within the handler is
opaque to both the ambient context and the context of the computation
being handled. Semantically, parameterised handlers are defined as
folds with state threading over computation trees.
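This ``fold with state threading'' reading can be made concrete with a
small sketch (in Python, with an illustrative tree encoding and a
hypothetical $\dec{tick}$ operation; none of these names come from the
calculus).

```python
# A computation tree is either ("return", v) or ("op", label, arg, k),
# where the continuation k maps an operation result to the rest of the tree.
def ret(v):
    return ("return", v)

def op(label, arg, k):
    return ("op", label, arg, k)

def phandle(tree, q, ret_clause, op_clauses):
    """Fold over a computation tree, threading a parameter q through.

    ret_clause(v, q) handles the final return value; each operation
    clause receives the argument, a resumption, and the current q. The
    resumption expects a pair of the operation result and the new
    parameter value, mirroring r_i : (B_i; A') -> D.
    """
    if tree[0] == "return":
        return ret_clause(tree[1], q)
    _, label, arg, k = tree
    def resume(pair):
        result, q_new = pair
        return phandle(k(result), q_new, ret_clause, op_clauses)
    return op_clauses[label](arg, resume, q)

# A computation that performs "tick" twice, then returns "done";
# the handler counts the ticks in its parameter.
comp = op("tick", (), lambda _: op("tick", (), lambda _: ret("done")))
result = phandle(comp, 0,
                 lambda v, q: (v, q),
                 {"tick": lambda arg, resume, q: resume(((), q + 1))})
# result == ("done", 2): the state cell counted both ticks
```

The state is never visible to the handled computation: it only travels
through the resumptions, exactly as in the calculus.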
We take the deep handler calculus $\HCalc$ as our starting point and
extend it with parameterised handlers to yield the calculus $\HPCalc$.
\subsection{Syntax and static semantics}
In addition to a computation, a parameterised handler also takes a
value as argument. This argument is the initial value of the state
cell embedded inside the handler.
%
\begin{figure}
\begin{syntax}
& F &::=& \cdots \mid \Record{C; A} \Rightarrow^\param D\\
& \delta &::=& \cdots \mid \param\\
& M,N &::=& \cdots \mid \ParamHandle\; M \;\With\; H^\param(W)\\
& H^\param &::=& q^A.~H
\end{syntax}
\caption{Syntax extensions for parameterised handlers.}\label{fig:param-syntax}
\end{figure}
%
Figure~\ref{fig:param-syntax} extends the syntax of
$\HCalc$ with parameterised handlers. Syntactically a parameterised
@@ -7426,20 +7522,17 @@ definition is applied to a value $W$, which is the initial value of
the parameter $q$.
%
\begin{figure}
\begin{mathpar}
% Handle
\inferrule*[Lab=\tylab{Handle^\param}]
{
% \typ{\Gamma}{V : A} \\
\typ{\Gamma}{M : C} \\
\typ{\Gamma}{W : A} \\
\typ{\Gamma}{H^\param : \Record{C; A} \Harrow^\param D}
}
{\Gamma \vdash \ParamHandle \; M \; \With\; H^\param(W) : D}
\end{mathpar}
~
\begin{mathpar}
% Parameterised handler
\inferrule*[Lab=\tylab{Handler^\param}]
{{\bl
@@ -7448,14 +7541,11 @@ the parameter $q$.
H = \{\Return\;x \mapsto M\} \uplus \{ \OpCase{\ell_i}{p_i}{r_i} \mapsto N_i \}_i
\el}\\\\
\typ{\Delta;\Gamma, q : A', x : A}{M : D}\\\\
[\typ{\Delta;\Gamma,q : A', p_i : A_i, r_i : \Record{B_i;A'} \to D}{N_i : D}]_i
}
{\typ{\Delta;\Gamma}{(q^{A'} . H) : \Record{C;A'} \Harrow^\param D}}
\end{mathpar}
%%
\caption{Typing rules for parameterised handlers.}\label{fig:param-static-semantics}
\end{figure}
%
We require two additional rules to type parameterised handlers. The
rules are given in Figure~\ref{fig:param-static-semantics}. The main
differences between the \tylab{Handler} and \tylab{Handler^\param} rules are
@@ -7495,21 +7585,62 @@ handling the return of a computation is as follows.
Both the return value $V$ and the parameter value $W$ are substituted
into the return case body $N$ for their respective binders.
\subsection{Process scheduling}
\begin{example}[Green threads]
\dhil{Example: Lightweight threads}
\end{example}
\[
\Co \defas \{\UFork : \UnitType \opto \Int; \Interrupt : \UnitType \opto \UnitType; \Wait : \Int \opto \UnitType\}
\]
% \section{Default handlers}
% \label{sec:unary-default-handlers}
\[
\bl
\dec{SchedState}~\alpha~\varepsilon \defas \Record{\List\,(\dec{Process}~\alpha~\varepsilon);\List~\alpha;\Int;\Int}\\
\dec{Process}~\alpha~\varepsilon \defas \dec{SchedState}~\alpha~\varepsilon \to \alpha \eff \varepsilon
\el
\]
% \chapter{N-ary handlers}
% \label{ch:multi-handlers}
\[
\bl
\scheduler : \Record{\alpha \eff \Co;\dec{SchedState}~\alpha~\Co} \Harrow^\param \List~\alpha\\
\scheduler \defas
\bl
\Record{ps;qs;done;pid;nextpid}.\\
\bl
\Return\;x \mapsto \Case\;qs \concat ps\;\{\\
\quad\bl
\nil \mapsto x \cons done\\
\Record{pid';qs';resume} \cons ps' \mapsto resume\,\Record{ps';qs';done;pid;nextpid}\}
\el\\
\OpCase{\UFork}{\Unit}{resume} \mapsto\\
\quad\bl
\Let\;ps' \revto \Record{nextpid;\nil;\lambda s.resume~\Record{0;s}} \cons ps\;\In\\
resume\,\Record{nextpid;\Record{ps';qs;done;pid;nextpid+1}}
\el\\
\OpCase{\Interrupt}{\Unit}{resume} \mapsto\\
\quad\bl\Let\;\Record{pid';qs';resume'} \cons ps' \revto ps \concat [\Record{pid;qs;\lambda s.resume\,\Record{\Unit;s}}]\;\In\\
resume'\,\Record{ps';qs';done;pid';nextpid}
\el\\
\OpCase{\Wait}{pid'}{resume} \mapsto\\
\quad\bl
\Let\; q \revto \Record{qs;\lambda s.resume\,\Record{\Unit;s}}\;\In\\
\Let\;\Record{qs';resume'} \revto \lookup\,\Record{pid';ps}\;\In\\
\Let\;p \revto \Record{\Record{pid;q} \cons qs;resume'}\;\In\\
\Let\;s \revto \modify\,\Record{pid';p;ps}\;\In\\
\el
\el
\el
\el
\]
% \section{Syntax and Static Semantics}
% \section{Dynamic Semantics}
% \section{Unifying deep and shallow handlers}
\[
\bl
\timesharee : (\UnitType \to \alpha \eff \Co) \to \List~\alpha\\
\timesharee~m \defas
\bl
\ParamHandle\;m\,\Unit\;\With\; \scheduler(\Record{\nil;1;2})
\el
\el
\]
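As a cross-check of the intended scheduler behaviour, here is a
minimal sketch using Python generators in place of a parameterised
handler; the operation encoding, the thread bodies, and the pid
bookkeeping are assumptions of the sketch, not a transcription of
$\scheduler$.

```python
from collections import deque

FORK, INTERRUPT, WAIT = "ufork", "interrupt", "wait"

def scheduler(main):
    """Round-robin scheduler for generator-based green threads.

    Threads perform operations by yielding (FORK, thunk), (INTERRUPT,),
    or (WAIT, pid); the scheduler plays the role of the parameterised
    handler, threading the run queue, wait queues, finished results,
    and pid counter as its state.
    """
    run = deque([(1, main)])    # runnable (pid, generator) pairs
    waiting = {}                # pid -> threads blocked waiting on it
    done = set()                # pids of finished threads
    reply = {}                  # value to deliver on a thread's next resume
    results, nextpid = [], 2
    while run:
        pid, t = run.popleft()
        try:
            req = t.send(reply.pop(pid, None))
        except StopIteration as stop:
            results.append(stop.value)
            done.add(pid)
            run.extend(waiting.pop(pid, []))   # wake any waiters
            continue
        if req[0] == FORK:
            run.append((nextpid, req[1]()))    # schedule the child
            reply[pid] = nextpid               # parent learns the child pid
            run.append((pid, t))
            nextpid += 1
        elif req[0] == INTERRUPT:
            run.append((pid, t))               # back of the run queue
        elif req[0] == WAIT:
            if req[1] in done:
                run.append((pid, t))           # target already finished
            else:
                waiting.setdefault(req[1], []).append((pid, t))
    return results

def child():
    yield (INTERRUPT,)
    return "child"

def main_thread():
    pid = yield (FORK, child)
    yield (WAIT, pid)
    return "main"

scheduler(main_thread())  # == ["child", "main"]
```

The parent forks a child, then blocks on it; the child yields once and
finishes, waking the parent, so the child's result appears first.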
\section{Related work}
\label{sec:unix-related-work}