Since writing the post on a hypothetical hull language as an alternative to shell, I cannot stop thinking about the shortcomings of shell.
And one thing that comes to mind over and over is type safety. Shell treats everything as a string, and that's the source of both its power and its poor maintainability.
So when I ask whether shell can be improved, the question is actually more subtle: Can it be improved without compromising its versatility? Can we, for example, be more type-safe without having to type Java-like stuff on the command line? Without sacrificing the powerful and dangerous features like string expansion?
I mean, you can write shell-like scripts in Python even today and use type hints to get type safety. But in the real world this practice seems to be restricted to more complex programs, ones that require actual in-language processing, complex control flow, use of libraries and so on. Your typical shell script, which just chains together a handful of UNIX utilities? No, I don't see that happening a lot.
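To make the Python option concrete, here is a minimal sketch of what such typed glue might look like. The helper names (run, ls, count_lines) are made up for illustration, and the sketch assumes a UNIX-like system with ls and wc on the PATH and file names that contain no newlines.

```python
import subprocess
from pathlib import Path
from typing import List


def run(argv: List[str], stdin: str = "") -> str:
    """Run a program, feed it stdin, and return its stdout as text.

    Raises subprocess.CalledProcessError on a non-zero exit status.
    """
    result = subprocess.run(
        argv, input=stdin, capture_output=True, text=True, check=True
    )
    return result.stdout


def ls(directory: Path) -> List[Path]:
    """Typed wrapper around ls: takes a path, returns a list of paths."""
    names = run(["ls", "-1", "--", str(directory)]).splitlines()
    return [directory / name for name in names]


def count_lines(file: Path) -> int:
    """Typed wrapper around wc -l: takes a path, returns an integer."""
    # wc prints "<count> <file name>"; only the first field is needed.
    return int(run(["wc", "-l", "--", str(file)]).split()[0])
```

A type checker such as mypy would then reject count_lines("foo") or ls(42) before anything runs. The price is visible too: every utility needs a hand-written wrapper, which is probably why nobody does this for a five-line script.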
To put it in other words, different "scripting languages" managed to carve their own problem spaces from what once used to be the domain of shell, but almost none of them attacked its very core use case, the place where it acts as a dumb glue between stand-alone applications.
But when writing shell scripts, I observe that I do have a type system in mind. When I type "ls" I know that an argument of type "path" should follow. Sometimes I am even explicit about it. When I save JSON into a file, I name it "foo.json". But none of that is formalized in the language.
And in a somewhat hacky way, shell is already aware of types to some extent. When I type "ls" and press Tab twice, a list of files appears on the screen. When I type "git checkout", pressing Tab twice results in a list of git branches. So, in a way, shell "knows" what kind of argument is expected.
And the question that's bugging me is whether the same can be done in a more systematic way.
Maybe it's possible to have a shell-like language with an actual type system. Maybe it could know that a file with a .json extension is supposed to contain JSON. Or it could know that "jq" expects JSON as input. Maybe it could know that JSON is a kind of text file and that any program accepting a text file (e.g. grep) can therefore accept JSON as well. And it could know that "ls -l" returns a specific "type", a refinement of "text file" and "file with one item per line", with items like access rights, ownership, file size and so on.
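The subtyping part of this idea can be sketched in a few lines of Python, using class inheritance to stand in for "is a kind of". Everything here is hypothetical: the class names, the two lookup tables and the rule that any file without a .json extension counts as plain text are assumptions made for the sake of the example, not a description of any existing tool.

```python
class Text:
    """Any text stream or text file."""


class Json(Text):
    """Text that is supposed to contain JSON."""


class LinePerItem(Text):
    """Text with one item per line."""


class LsLong(LinePerItem):
    """Output of "ls -l": one line per file, with access rights, owner, size."""


# Hypothetical annotations: what each program expects on its input
# and what it produces on its output.
INPUT_TYPE = {"grep": Text, "jq": Json}
OUTPUT_TYPE = {"ls -l": LsLong}


def type_from_extension(filename: str) -> type:
    """Guess the type of a file from its name.

    Assumption of this sketch: anything that is not .json is plain text.
    """
    return Json if filename.endswith(".json") else Text


def accepts(program: str, actual: type) -> bool:
    """Check whether data of type `actual` may be fed into `program`."""
    return issubclass(actual, INPUT_TYPE[program])
```

With these definitions accepts("grep", Json) holds, because JSON is a kind of text, while accepts("jq", Text) does not. The output of "ls -l" can be piped into grep but not into jq.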
But how would one do that?
In addition to the language implementing a type system, it would require some kind of annotation of common UNIX utilities, adding a formal specification of their arguments and outputs. (With all programs not present in the database defaulting to "any number of arguments of any type and any output".) Maybe it can be done by simple type-safe wrappers on top of existing non-type-safe binaries.
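Here is a rough Python sketch of such an annotation database, including the permissive default for programs it does not know. The Signature shape, the type names ("path", "lines", "text") and the entries themselves are invented for illustration. Flags and options are ignored entirely, and a real specification would have to deal with them.

```python
from dataclasses import dataclass
from typing import Dict, List

ANY = "any"  # wildcard: no constraint


@dataclass(frozen=True)
class Signature:
    arg_type: str  # type required of every positional argument
    output: str    # type of whatever the program writes to stdout


# Hypothetical annotations for a couple of common utilities.
DATABASE: Dict[str, Signature] = {
    "ls": Signature(arg_type="path", output="lines"),
    "cat": Signature(arg_type="path", output="text"),
}

# Programs missing from the database accept anything and return anything.
UNKNOWN = Signature(arg_type=ANY, output=ANY)


def lookup(program: str) -> Signature:
    """Return the annotation for a program, or the permissive default."""
    return DATABASE.get(program, UNKNOWN)


def check_call(program: str, arg_types: List[str]) -> bool:
    """Check the types of the arguments against the program's annotation."""
    expected = lookup(program).arg_type
    return expected == ANY or all(t == expected for t in arg_types)
```

So check_call("ls", ["path", "path"]) passes, check_call("ls", ["int"]) fails, and a call to a program nobody has annotated always passes, which keeps the untyped world working as before.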
There seems to be a lot of food for thought here.
May 12th, 2019
It can be done, but sadly that way lies PowerShell
@Tom Package-deal fallacy.
How about PowerShell?
You may be interested in the Oil Shell project. The fact that arrays and strings are poorly distinguished was a major motivation for the project as I understand it. http://www.oilshell.org/
Just imagine every command or tool you call within your script would accept and return/output JSON. That would reduce many common errors.
The command line syntax (including the types and expectations) should be inspectable, without execution or participation of the tool itself. Embedded in the DWARF info, or equivalent?
I was thinking along the same lines. Some kind of database containing metadata about the executables: the arguments, the input format, the output format etc.
I came up with a similar idea about web service reflection. Why can't a tool like poster (or even curl) ask a web service for its interface, say at example.com/api, and get a schema that is both machine- and human-readable? Then it could build a UI so that I can interact with ease, without "400 bad request".
One important thing would be to ensure it does not lie. Maybe the schema should be used as the source for generating API stubs and a test suite (just a dry run without invoking the actual handler, ensuring it does not return 400/404).
Also: studying the shell completion languages (zsh, bash) would probably be useful here.
What is the difference between a shell command and a function in a program?
The shell command has a text UI and a function does not. If we were to automate the creation of a UI for functions, then functions could take the role of commands. And a UI does not have to be text based.
It's a tempting idea. Funnily enough, though, strings are often sufficient. When you run ps and grep for a PID, it is an integer, but who cares? Would you need to handle it as an integer? Only in cases where you care about a certain property of ints, e.g. comparison.
What I personally miss in bash are more decent data structures: there are arrays and associative arrays, but nothing beyond that. I'd guess decent parsing of numbers in shell with actual error handling would be enough for me, plus some kind of assert, e.g. 'this must be a number'. I would not need a full type system and type checking in shell.
As for IPC, I'd hate JSON as a protocol, because I hate JSON. I'd even be in favor of a binary protocol, but IMHO anything with a fixed structure is a trap. Text is ridiculously simple to parse/extract; anything with structure becomes a pain over time.
You know, I never really thought about it, but I do wonder if someone could encode a dependent-type-system into a shell. Languages like Idris have a REPL, indicating that they *can* run in interpreted mode, so conceivably you could have some kind of typesafe pipe that handles it.
A while ago I took a stab at making a Lisp-based command line shell, which was going to play nicely with JSON and Protobufs by default, but my computer crashed, I got a new computer, and forgot that I had started that project until right now. I should start that again.
Might be worth glancing at TypeScript's type definitions. By using .d.ts files written in a definition language, type safety can be added on top of JavaScript. Good ones are taken up into a repository of officially supported typings. With a similar approach, type information could, at least to some degree, be added to command line tools.