mirror of https://github.com/dhil/phd-dissertation synced 2026-04-28 15:06:29 +01:00

Compare commits


4 Commits

Author SHA1 Message Date
f6613484ca skeleton code 2021-02-04 00:25:50 +00:00
756ce77e5a Edits 2021-02-03 22:29:38 +00:00
33781a94e2 Start parameterised handlers. 2021-02-03 22:26:47 +00:00
9fc66e6b62 Frequency 2021-02-03 18:04:15 +00:00
3 changed files with 211 additions and 56 deletions


@@ -41,6 +41,7 @@
\newcommand{\EffCalc}{\ensuremath{\lambda_{\mathsf{eff}}}\xspace}
\newcommand{\UCalc}{\ensuremath{\lambda_{\mathsf{u}}}\xspace}
\newcommand{\param}{\ensuremath{\ddagger}}
\newcommand{\HPCalc}{\ensuremath{\lambda_{\mathsf{h^\param}}}\xspace}
%%
%% Calculi terms and types type-setting.
@@ -448,6 +449,12 @@
\newcommand{\paste}{\dec{paste}}
\newcommand{\sed}{\dec{sed}}
\newcommand{\printTable}{\dec{renderTable}}
\newcommand{\timesharee}{\dec{timeshare2}}
\newcommand{\Co}{\dec{Co}}
\newcommand{\UFork}{\dec{UFork}}
\newcommand{\ufork}{\dec{ufork}}
\newcommand{\Wait}{\dec{Wait}}
\newcommand{\scheduler}{\dec{scheduler}}
%%
%% Some control operators


@@ -1712,6 +1712,23 @@
OPTaddress = {Boston, MA, USA}
}
@book{PizziniBMG20,
author = {Ken Pizzini and Paolo Bonzini and Jim Meyering and Assaf Gordon},
@Comment = {David MacKenzie
@Comment and Jim Meyering
@Comment and Ross Paterson
@Comment and François Pinard
@Comment and Karl Berry
@Comment and Brian Youmans
@Comment and Richard Stallman},
title = {{GNU} sed, a stream editor},
note = {For version 4.8},
month = jan,
year = 2020,
publisher = {Free Software Foundation},
OPTaddress = {Boston, MA, USA}
}
# Expressiveness
@inproceedings{Felleisen90,
author = {Matthias Felleisen},


@@ -7151,7 +7151,7 @@ invoked with the resumption of the producer along with a thunk that
applies the consumer's resumption to the yielded value.
%
For aesthetics, we define a right-associative infix alias for pipe:
$p \mid c \defas \lambda\Unit.\Pipe\,\Record{p;c}$.
Let us put the pipe operator to use by performing a simple string
frequency analysis on a file. We will implement the analysis as a
@@ -7245,6 +7245,16 @@ the character was nil in which case the process
terminates. Alternatively, if the character was a newline the function
applies itself recursively with $n$ decremented by one. Otherwise it
applies itself recursively with the original $n$.
The $\head$ filter does not transform the shape of its data stream: it
both awaits and yields characters. However, the awaited and yielded
types need not coincide within a single filter, meaning we can
implement a filter that transforms the shape of the data. Let us
implement a variation of the GNU Coreutils utility \emph{paste}, which
merges lines of files~\cite[Section~8.2]{MacKenzieMPPBYS20}. Our
implementation will join characters in its input stream into strings
separated by spaces and newlines such that the string frequency
analysis utility need not operate on the low level of characters.
%
\[
\bl
@@ -7264,6 +7274,33 @@ applies itself recursively with the original $n$.
\el
\]
%
The heavy lifting is delegated to the recursive function $paste'$,
which accepts two parameters: 1) the next character in the input
stream, and 2) a string buffer for building the output string. The
function is initially applied to the first character from the stream
(returned by the invocation of $\Await$) and the empty string
buffer. The function $paste'$ is defined by pattern matching on the
character parameter. The first three definitions handle the special
cases when the received character is nil, newline, and space,
respectively. If the character is nil, then the function yields the
contents of the string buffer followed by a string containing
only the nil character. If the character is a newline, then the
function yields the string buffer followed by a string containing the
newline character. Afterwards the function applies itself recursively
with the next character from the input stream and an empty string
buffer. The case when the character is a space is similar to the
previous case except that it does not yield a newline string. The
final definition simply concatenates the character onto the string
buffer and recurses.
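For comparison, the same reshaping can be sketched with ordinary
Python generators, reading $\Await$ as pulling the next element from
an input iterator and $\Yield$ as the generator's yield; the
\texttt{None} sentinel standing in for the nil character is an
assumption of the sketch, not part of the calculus.

```python
def paste(chars):
    """Join a character stream into whitespace-delimited strings.

    Pulling from `chars` plays the role of Await; `yield` plays the
    role of Yield. A None sentinel stands in for the nil character.
    """
    buf = ""
    for c in chars:
        if c is None:      # nil: flush the buffer, forward the sentinel
            yield buf
            yield None
            return
        elif c == "\n":    # newline: flush buffer, then the newline itself
            yield buf
            yield "\n"
            buf = ""
        elif c == " ":     # space: flush buffer, drop the delimiter
            yield buf
            buf = ""
        else:              # ordinary character: extend the buffer
            buf += c
```

For instance, `list(paste(iter(list("to be,\n") + [None])))` evaluates
to `["to", "be,", "\n", "", None]`.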
Another useful filter is the GNU stream editor,
\emph{sed}~\cite{PizziniBMG20}. It is an advanced text processing
tool, whose complete functionality we will not attempt to replicate
here. We will implement just the ability to replace one string with
another. This will be useful for normalising the input stream to the
frequency analysis utility, e.g.\ decapitalising words, removing
unwanted characters, etc.
%
\[
\bl
\sed : \Record{\String;\String} \to \UnitType \eff \{\Await : \UnitType \opto \String;\Yield : \String \opto \UnitType\}\\
@@ -7276,6 +7313,16 @@ applies itself recursively with the original $n$.
\el
\]
%
The function $\sed$ takes two string arguments. The first argument is
the string to be replaced in the input stream, and the second argument
is the replacement. The function first awaits the next string from the
input stream; then it checks whether the received string is the same
as $target$, in which case it yields the replacement $str'$ and
recurses. Otherwise it yields the received string and recurses.
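A generator rendition of this filter is short; as before, the
\texttt{None} end-of-stream sentinel is an assumption of the sketch.

```python
def sed(target, replacement, strings):
    """Replace every occurrence of `target` in a string stream.

    Await the next string, then yield either the replacement or the
    string unchanged; the None sentinel never matches `target`, so the
    end of the stream is forwarded as-is.
    """
    for s in strings:
        yield replacement if s == target else s
```

For instance, `list(sed("To", "to", iter(["To", "be", "To", None])))`
evaluates to `["to", "be", "to", None]`.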
Now let us implement the string frequency analysis utility. It works
on strings, counting the occurrences of each string in the input
stream.
%
\[
\bl
\freq : \UnitType \to \UnitType \eff \{\Await : \UnitType \opto \String;\Yield : \List\,\Record{\String;\Int} \opto \UnitType\}\\
@@ -7300,20 +7347,50 @@ applies itself recursively with the original $n$.
\el
\]
%
The auxiliary recursive function $freq'$ implements the analysis. It
takes two arguments: 1) the next string from the input stream, and 2)
a table to keep track of how many times each string has occurred. The
table is implemented as an association list indexed by strings. The
function is initially applied to the first string from the input
stream and the empty list. The function is defined by pattern matching
on the string argument. The first definition handles the case when the
input stream has been exhausted in which case the function yields the
table. The other case is responsible for updating the entry associated
with the string $str$ in the table $tbl$. There are two subcases to
consider: 1) the string has not been seen before, thus a new entry
will have to be created; or 2) the string already has an entry in the
table, thus the entry will have to be updated. We handle both cases
simultaneously by making use of the handler $\faild$, where the
default value accounts for the first subcase, and the computation
accounts for the second. The computation attempts to look up the entry
associated with $str$ in $tbl$; if the lookup fails then $\faild$
returns the default value, which is the original table augmented with
an entry for $str$. If an entry already exists it gets incremented by
one. The resulting table $tbl'$ is supplied to a recursive application
of $freq'$.
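The same logic can be sketched as a generator over an association
list; the explicit lookup-with-default below is a stand-in for the use
of $\faild$, not a transcription of it.

```python
def freq(strings):
    """Count string occurrences, yielding an association list at nil.

    The lookup-with-default mirrors the faild handler: a failed lookup
    produces the default (a fresh entry for the string), a successful
    one increments the existing entry.
    """
    tbl = []                          # association list, insertion-ordered
    for s in strings:
        if s is None:                 # stream exhausted: yield the table
            yield tbl
            return
        for i, (k, n) in enumerate(tbl):
            if k == s:
                tbl[i] = (k, n + 1)   # existing entry: increment it
                break
        else:
            tbl.append((s, 1))        # unseen string: create an entry
```

For instance, `next(freq(iter(["to", "be", "to", None])))` evaluates
to `[("to", 2), ("be", 1)]`.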
We need one more building block to complete the pipeline. The utility
$\freq$ returns a value of type $\List~\Record{\String;\Int}$; we need
a utility to render this value as a string in order to write it to a
file.
%
\[
\bl
\printTable : \UnitType \to \UnitType \eff \{\Await : \UnitType \opto \List\,\Record{\String;\Int}\}\\
\printTable\,\Unit \defas
\map\,\Record{\lambda\Record{s;i}.s \concat \strlit{:} \concat \intToString~i \concat \strlit{;};\Do\;\Await~\Unit}
\el
\]
%
The function performs one invocation of $\Await$ to receive the table,
and then performs a $\map$ over the table. The function argument to
$\map$ builds a string from the provided string-integer pair.
%
Here we make use of an auxiliary function,
$\intToString : \Int \to \String$, that turns an integer into a
string. The definition of this function is omitted here for brevity.
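A generator sketch of this rendering step, with Python's built-in
string conversion standing in for $\intToString$:

```python
def render_table(tables):
    """Await one table, then render each (string, count) pair as "s:i;"."""
    tbl = next(tables)   # a single Await receives the whole table
    # str() plays the role of intToString here.
    yield "".join(s + ":" + str(i) + ";" for s, i in tbl)
```

For instance, `list(render_table(iter([[("to", 2), ("be", 1)]])))`
evaluates to `["to:2;be:1;"]`.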
%
%
% \[
% \bl
% \wc : \UnitType \to \UnitType \eff \{\Await : \UnitType \opto \Char;\Yield : \Int \opto \UnitType\}\\
@@ -7333,6 +7410,11 @@ applies itself recursively with the original $n$.
% \el
% \]
%
We now have all the building blocks to construct a pipeline for
performing string frequency analysis on a file. The following performs
the analysis on the first two lines of the Hamlet quote.
%
\[
\ba{@{~}l@{~}l}
&\bl
@@ -7343,13 +7425,14 @@ applies itself recursively with the original $n$.
\qquad\qquad\status\,(\lambda\Unit.
\ba[t]{@{}l}
\quoteHamlet~\redirect~\strlit{hamlet};\\
\Let\;p \revto
\bl
~~(\lambda\Unit.\cat~\strlit{hamlet}) \mid (\lambda\Unit.\head~2) \mid \paste\\
\mid (\lambda\Unit.\sed\,\Record{\strlit{be,};\strlit{be}}) \mid (\lambda\Unit.\sed\,\Record{\strlit{To};\strlit{to}})\\
\mid (\lambda\Unit.\sed\,\Record{\strlit{question:};\strlit{question}})\\
\mid \freq \mid \printTable
\el\\
\In\;(\lambda\Unit.\echo~(p\,\Unit))~\redirect~\strlit{analysis})})))}
\ea
\el \smallskip\\
\reducesto^+&
@@ -7368,8 +7451,8 @@ applies itself recursively with the original $n$.
\ba[t]{@{}l}
\Record{2;
\ba[t]{@{}l@{}l}
\texttt{"}&\texttt{to:2;be:2;or:1;not:1;\nl:2;that:1;is:1}\\
&\texttt{the:1;question:1;"}},
\ea\\
\Record{1;
\ba[t]{@{}l@{}l}
@@ -7385,35 +7468,48 @@ applies itself recursively with the original $n$.
\ea
\]
%
%
% \[
% \bl
% \grep : \String \to \UnitType \eff \{\Await : \UnitType \opto \Char;\Yield : \Char \opto \UnitType\}\\
% \grep~str \defas
% \bl
% \match\,\Record{\Do\;\Await~\Unit; str; \nil}
% \el
% \el
% \]
%
The pipeline gets bound to the variable $p$. The pipeline starts with
a call to $\cat$, which streams the contents of the file
$\strlit{hamlet}$ to the process $\head$ applied to $2$, meaning it
forwards only the first two lines of the file to its successor. The
third process $\paste$ receives the first two lines one character at a
time and joins the characters into strings delimited by
whitespace. The next three instances of $\sed$ perform some string
normalisation. The first instance removes the trailing comma from the
string $\strlit{be,}$; the second normalises the capitalisation of the
word ``to''; and the third removes the trailing colon from the string
$\strlit{question:}$. The seventh process performs the frequency
analysis and outputs a table, which is rendered as a string by the
eighth process. The output of the pipeline is supplied to the $\echo$
utility, whose output is redirected to a file named
$\strlit{analysis}$. The contents of the file reside in location $2$
in the data region. Here we can see that the analysis has found that
the words ``to'', ``be'', and the newline character ``$\nl$'' appear
twice each, whilst the other words appear once each.
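The whole pipeline can be replayed as a chain of Python generators;
$\cat$ and $\head$ are modelled over an in-memory string (there is no
file system in this sketch), and empty buffers are skipped for tidier
output.

```python
def cat(text):
    """Stream the characters of `text`, terminated by a None sentinel."""
    yield from text
    yield None

def head(n, chars):
    """Forward characters until n newlines (or the sentinel) have passed."""
    for c in chars:
        if c is None or n == 0:
            yield None
            return
        if c == "\n":
            n -= 1
        yield c

def paste(chars):
    """Join characters into whitespace-delimited strings."""
    buf = ""
    for c in chars:
        if c is None:
            if buf:
                yield buf
            yield None
            return
        if c in (" ", "\n"):
            if buf:
                yield buf
            if c == "\n":
                yield "\n"
            buf = ""
        else:
            buf += c

def sed(target, repl, strings):
    """Replace target strings; forward everything else unchanged."""
    for s in strings:
        yield repl if s == target else s

def freq(strings):
    """Count occurrences; yield the association list at end of stream."""
    tbl = []
    for s in strings:
        if s is None:
            yield tbl
            return
        for i, (k, n) in enumerate(tbl):
            if k == s:
                tbl[i] = (k, n + 1)
                break
        else:
            tbl.append((s, 1))

def render(tables):
    """Render the table as "s:i;" entries."""
    yield "".join(s + ":" + str(i) + ";" for s, i in next(tables))

hamlet = "To be, or not to be,\nthat is the question:\nWhether 'tis nobler\n"
p = render(freq(sed("question:", "question",
                sed("To", "to",
                sed("be,", "be",
                paste(head(2, cat(hamlet))))))))
```

Running `next(p)` produces
`"to:2;be:2;or:1;not:1;\n:2;that:1;is:1;the:1;question:1;"`, matching
the rendered table above.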
\section{Parameterised handlers}
\label{sec:unary-parameterised-handlers}
Parameterised handlers are a useful idiom for handling stateful
computations. They are a variation of ordinary deep handlers with an
embedded functional state cell. This state cell is only accessible
locally within the handler. The use of state within the handler is
opaque to both the ambient context and the context of the computation
being handled. Semantically, parameterised handlers are defined as
folds with state threading over computation trees.
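This ``fold with state threading'' reading can be made concrete with a
small sketch (in Python, with an illustrative tree encoding and a
hypothetical $\dec{tick}$ operation; none of these names come from the
calculus).

```python
# A computation tree is either ("return", v) or ("op", label, arg, k),
# where the continuation k maps an operation result to the rest of the tree.
def ret(v):
    return ("return", v)

def op(label, arg, k):
    return ("op", label, arg, k)

def phandle(tree, q, ret_clause, op_clauses):
    """Fold over a computation tree, threading a parameter q through.

    ret_clause(v, q) handles the final return value; each operation
    clause receives the argument, a resumption, and the current q. The
    resumption expects a pair of the operation result and the new
    parameter value, mirroring r_i : (B_i; A') -> D.
    """
    if tree[0] == "return":
        return ret_clause(tree[1], q)
    _, label, arg, k = tree
    def resume(pair):
        result, q_new = pair
        return phandle(k(result), q_new, ret_clause, op_clauses)
    return op_clauses[label](arg, resume, q)

# A computation that performs "tick" twice, then returns "done";
# the handler counts the ticks in its parameter.
comp = op("tick", (), lambda _: op("tick", (), lambda _: ret("done")))
result = phandle(comp, 0,
                 lambda v, q: (v, q),
                 {"tick": lambda arg, resume, q: resume(((), q + 1))})
# result == ("done", 2): the state cell counted both ticks
```

The state is never visible to the handled computation: it only travels
through the resumptions, exactly as in the calculus.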
We take the deep handler calculus $\HCalc$ as our starting point and
extend it with parameterised handlers to yield the calculus $\HPCalc$.
\subsection{Syntax and static semantics}
In addition to a computation, a parameterised handler also takes a
value as argument. This argument is the initial value of the state
cell embedded inside the handler.
%
\begin{figure}
\begin{syntax}
& F &::=& \cdots \mid \Record{C; A} \Rightarrow^\param D\\
& \delta &::=& \cdots \mid \param\\
& M,N &::=& \cdots \mid \ParamHandle\; M \;\With\; H^\param(W)\\
& H^\param &::=& q^A.~H
\end{syntax}
\caption{Syntax extensions for parameterised handlers.}\label{fig:param-syntax}
\end{figure}
%
Figure~\ref{fig:param-syntax} extends the syntax of
$\HCalc$ with parameterised handlers. Syntactically a parameterised
@@ -7426,20 +7522,17 @@ definition is applied to a value $W$, which is the initial value of
the parameter $q$.
%
\begin{figure}
\begin{mathpar}
% Handle
\inferrule*[Lab=\tylab{Handle^\param}]
{
% \typ{\Gamma}{V : A} \\
\typ{\Gamma}{M : C} \\
\typ{\Gamma}{W : A} \\
\typ{\Gamma}{H^\param : \Record{C; A} \Harrow^\param D}
}
{\Gamma \vdash \ParamHandle \; M \; \With\; H^\param(W) : D}
\end{mathpar}
~
\begin{mathpar}
% Parameterised handler
\inferrule*[Lab=\tylab{Handler^\param}]
{{\bl
@@ -7448,14 +7541,11 @@ the parameter $q$.
H = \{\Return\;x \mapsto M\} \uplus \{ \OpCase{\ell_i}{p_i}{r_i} \mapsto N_i \}_i
\el}\\\\
\typ{\Delta;\Gamma, q : A', x : A}{M : D}\\\\
[\typ{\Delta;\Gamma,q : A', p_i : A_i, r_i : \Record{B_i;A'} \to D}{N_i : D}]_i
}
{\typ{\Delta;\Gamma}{(q^{A'} . H) : \Record{C;A'} \Harrow^\param D}}
\end{mathpar}
%%
\caption{Typing rules for parameterised handlers.}\label{fig:param-static-semantics}
\end{figure}
%
We require two additional rules to type parameterised handlers. The
rules are given in Figure~\ref{fig:param-static-semantics}. The main
differences between the \tylab{Handler} and \tylab{Handler^\param} rules are
@@ -7495,21 +7585,62 @@ handling the return of a computation is as follows.
Both the return value $V$ and the parameter value $W$ are substituted
into the return case body $N$ for their respective binders.
\subsection{Process scheduling}
\begin{example}[Green threads]
\dhil{Example: Lightweight threads}
\end{example}
\[
\Co \defas \{\UFork : \UnitType \opto \Int; \Interrupt : \UnitType \opto \UnitType; \Wait : \Int \opto \UnitType\}
\]
% \section{Default handlers}
% \label{sec:unary-default-handlers}
\[
\bl
\dec{SchedState}~\alpha~\varepsilon \defas \Record{\List\,(\dec{Process}~\alpha~\varepsilon);\List~\alpha;\Int;\Int}\\
\dec{Process}~\alpha~\varepsilon \defas \dec{SchedState}~\alpha~\varepsilon \to \alpha \eff \varepsilon
\el
\]
% \chapter{N-ary handlers}
% \label{ch:multi-handlers}
\[
\bl
\scheduler : \Record{\alpha \eff \Co;\dec{SchedState}~\alpha~\Co} \Harrow^\param \List~\alpha\\
\scheduler \defas
\bl
\Record{ps;qs;done;pid;nextpid}.\\
\bl
\Return\;x \mapsto \Case\;qs \concat ps\;\{\\
\quad\bl
\nil \mapsto x \cons done\\
\Record{pid';qs';resume} \cons ps' \mapsto resume\,\Record{ps';qs';done;pid;nextpid}\}
\el\\
\OpCase{\UFork}{\Unit}{resume} \mapsto\\
\quad\bl
\Let\;ps' \revto \Record{nextpid;\nil;\lambda s.resume~\Record{0;s}} \cons ps\;\In\\
resume\,\Record{nextpid;\Record{ps';qs;done;pid;nextpid+1}}
\el\\
\OpCase{\Interrupt}{\Unit}{resume} \mapsto\\
\quad\bl\Let\;\Record{pid';qs';resume'} \cons ps' \revto ps \concat [\Record{pid;qs;\lambda s.resume\,\Record{\Unit;s}}]\;\In\\
resume'\,\Record{ps';qs';done;pid';nextpid}
\el\\
\OpCase{\Wait}{pid'}{resume} \mapsto\\
\quad\bl
\Let\; q \revto \Record{qs;\lambda s.resume\,\Record{\Unit;s}}\;\In\\
\Let\;\Record{qs';resume'} \revto \lookup\,\Record{pid';ps}\;\In\\
\Let\;p \revto \Record{\Record{pid;q} \cons qs;resume'}\;\In\\
\Let\;s \revto \modify\,\Record{pid';p;ps}\;\In\\
\el
\el
\el
\el
\]
% \section{Syntax and Static Semantics}
% \section{Dynamic Semantics}
% \section{Unifying deep and shallow handlers}
\[
\bl
\timesharee : (\UnitType \to \alpha \eff \Co) \to \List~\alpha\\
\timesharee~m \defas
\bl
\ParamHandle\;m\,\Unit\;\With\; \scheduler(\Record{\nil;1;2})
\el
\el
\]
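As a cross-check of the intended scheduler behaviour, here is a
minimal sketch using Python generators in place of a parameterised
handler; the operation encoding, the thread bodies, and the pid
bookkeeping are assumptions of the sketch, not a transcription of
$\scheduler$.

```python
from collections import deque

FORK, INTERRUPT, WAIT = "ufork", "interrupt", "wait"

def scheduler(main):
    """Round-robin scheduler for generator-based green threads.

    Threads perform operations by yielding (FORK, thunk), (INTERRUPT,),
    or (WAIT, pid); the scheduler plays the role of the parameterised
    handler, threading the run queue, wait queues, finished results,
    and pid counter as its state.
    """
    run = deque([(1, main)])    # runnable (pid, generator) pairs
    waiting = {}                # pid -> threads blocked waiting on it
    done = set()                # pids of finished threads
    reply = {}                  # value to deliver on a thread's next resume
    results, nextpid = [], 2
    while run:
        pid, t = run.popleft()
        try:
            req = t.send(reply.pop(pid, None))
        except StopIteration as stop:
            results.append(stop.value)
            done.add(pid)
            run.extend(waiting.pop(pid, []))   # wake any waiters
            continue
        if req[0] == FORK:
            run.append((nextpid, req[1]()))    # schedule the child
            reply[pid] = nextpid               # parent learns the child pid
            run.append((pid, t))
            nextpid += 1
        elif req[0] == INTERRUPT:
            run.append((pid, t))               # back of the run queue
        elif req[0] == WAIT:
            if req[1] in done:
                run.append((pid, t))           # target already finished
            else:
                waiting.setdefault(req[1], []).append((pid, t))
    return results

def child():
    yield (INTERRUPT,)
    return "child"

def main_thread():
    pid = yield (FORK, child)
    yield (WAIT, pid)
    return "main"

scheduler(main_thread())  # == ["child", "main"]
```

The parent forks a child, then blocks on it; the child yields once and
finishes, waking the parent, so the child's result appears first.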
\section{Related work}
\label{sec:unix-related-work}