mirror of
https://codeberg.org/la-chouette/minishell.git
synced 2025-12-06 07:28:09 +01:00

# Project Notes

cf. [Bash Reference Manual](https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html)

Comparative testing with bash should be done with bash --norc --posix.

## Shell Operation

cf. 3.1.1 Shell Operation

The shell breaks the input into **words** and **operators**, obeying the *Quoting Rules*.
These tokens are delimited by **metacharacters**.

It parses the tokens into simple and compound commands (see *Shell Commands*).

It performs the various _Shell Expansions_, breaking the expanded tokens into lists
of filenames and commands and arguments.
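
As a small illustration of the tokenizing step, the sketch below shows that a metacharacter separates tokens even without surrounding blanks, while the same character inside quotes stays part of a word. The variable names are only illustrative.

```shell
# No blanks around '|': the metacharacter alone separates the tokens
# 'one', '|' and 'tr'.
unquoted="$(echo one|tr 'a-z' 'A-Z')"

# Inside quotes, '|' is an ordinary character and stays part of the word.
quoted="$(echo 'one|two')"

printf '%s\n' "$unquoted" "$quoted"
```

The first line printed is the upper-cased output of the pipeline, the second is the literal quoted word.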

### Quoting Rules

cf. 3.1.2 Quoting

Quoting escapes metacharacters.

The quoting mechanisms we have to implement are (cf. Subject):

* Single quotes, which prevent the interpretation of metacharacters.
* Double quotes, which prevent the interpretation of metacharacters except for '$' (see
  _Shell Parameter Expansion_).

In the Bash Reference Manual, these are defined as follows (keeping only the parts we have to implement):

cf. 3.1.2.2 Single Quotes

Preserves the literal value of each character within the quotes.

cf. 3.1.2.3 Double Quotes

Preserves the literal value of all characters within the quotes, with the exception of '$'.

TODO: The special parameters ‘*’ and ‘@’ have special meaning when in double quotes (see Shell Parameter Expansion).
See if we have to handle this.

Per the subject: minishell should not interpret unclosed quotes.
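
The difference between the two quoting mechanisms can be checked directly in bash. This is a minimal sketch with an illustrative variable called `name`.

```shell
name=world

# Single quotes: every character is literal, including '$' and '|'.
single='hello $name | cat'

# Double quotes: '|' stays literal, but '$name' is still expanded.
double="hello $name | cat"

printf '%s\n' "$single" "$double"
```

Only the second string has `$name` replaced by its value; in both cases the `|` is not treated as a pipe.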

### Shell Commands

cf. 3.2 Shell Commands

A Shell Command may be either a *Simple Command*, a *Pipeline*, a *List of
Commands* (composed of one or more *Pipelines*), or a *Grouped Command*
(composed of one or more *Lists of Commands*).

#### Simple Commands

cf. 3.2.2 Simple Commands

A simple command is a sequence of words separated by **blanks**, terminated by one of the
shell’s **control operators**.
The first **word** specifies a command to be executed, with the rest of the
**words** being that command’s arguments.

The return status (see _Exit Status_) of a simple command is its exit status as
provided by the POSIX 1003.1 waitpid function, or 128+n if the command was
terminated by signal n.
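
The two cases of the return status can be observed from a script. The sketch below assumes that `sh` is available and that SIGTERM is not ignored by the calling environment; the variable names are illustrative.

```shell
# Normal termination: the status is the value passed to exit.
normal=0
sh -c 'exit 3' || normal=$?

# Termination by a signal: the child shell sends SIGTERM (signal 15) to itself.
signaled=0
sh -c 'kill -TERM $$' || signaled=$?

echo "$normal $signaled"   # prints "3 143", since 143 = 128 + 15
```

Bash may also report the terminated child on standard error; that message is not part of the status.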

#### Pipelines

cf. 3.2.3 Pipelines

A pipeline is a sequence of one or more commands separated by the control operator '|'.

The output of each command in the pipeline is connected via a pipe to the input
of the next command.
That is, each command reads the previous command’s output.
This connection is performed before any redirections specified by the first
command.

The shell waits for all commands in the pipeline to complete before reading the next command.

Each command in a multi-command pipeline, where pipes are created, is executed
in its own _subshell_, which is a separate process.

e.g.

```shell
export TT=1 | echo $TT
```

prints an empty line, because TT is only set in the first subshell and remains unset in the second.

The exit status of a pipeline is the exit status of the last command in the pipeline.

The shell waits for all commands in the pipeline to terminate before returning a value.
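
The exit status rule can be checked with two subshells that exit with different values. This is a sketch; `status` is an illustrative variable.

```shell
# The first command exits with 3, the last one with 5.
status=0
(exit 3) | (exit 5) || status=$?

echo "$status"   # prints 5: only the last command's status is reported
```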

#### Lists of Commands

cf. 3.2.4 Lists of Commands

A list is a sequence of one or more pipelines separated by one of the
**operators** ‘&&’ or ‘||’, and optionally terminated by a newline.

AND and OR lists are sequences of one or more pipelines separated by the control
operators ‘&&’ and ‘||’, respectively.
AND and OR lists are executed with left associativity.

e.g.

```shell
A && B && C
```

is the same as

```shell
(A && B) && C
```

An AND list has the form

```shell
A && B
```

B is executed if and only if A has an exit status of 0 (success).

An OR list has the form

```shell
A || B
```

B is executed if and only if A has a non-zero exit status (failure).

The return status of AND and OR lists is the exit status of the last command
executed in the list.
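
The rules above can be exercised in one short script. It is only a sketch: `log` records which right-hand sides actually ran, and `status` captures the return status of a list.

```shell
log=""

true && log="${log}a"                       # A succeeded, so B runs
false && log="${log}b" || log="${log}c"     # (false && b) fails, so c runs

# Equal precedence, left to right: parsed as (true || false) && d, so d runs.
true || false && log="${log}d"

# The status of a list is that of the last command executed in it.
status=0
false || (exit 7) || status=$?

echo "$log $status"   # prints "acd 7"
```

The third case is the one worth testing against minishell: with C-like precedence (‘&&’ binding tighter than ‘||’), `d` would not run.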

#### Group of Commands

cf. 3.2.5 Compound Commands

Each group begins with the **control operator** '(' and ends with the
**control operator** ')'.

Any redirections associated with a _group of commands_ apply to all commands
within that _group of commands_ unless explicitly overridden.

cf. 3.2.5.3 Grouping Commands

When commands are grouped, redirections may be applied to the entire command
list. For example, the output of all the commands in the list may be redirected
to a single stream.

( LIST )

The parentheses are operators, and are recognized as separate tokens by the
shell even if they are not separated from the LIST by whitespace.

Placing a list of commands between parentheses forces the shell to create a
_subshell_, and each of the commands in LIST is executed in that subshell
environment. Since the LIST is executed in a subshell, variable assignments do
not remain in effect after the subshell completes.

The exit status of this construct is the exit status of LIST.
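
A short sketch of the three properties described above (shared redirection, isolated variable assignments, exit status of the list). It assumes `mktemp` is available; the variable names are illustrative.

```shell
tmp="$(mktemp)"
x=outer

# One redirection applies to every command in the group.
(x=inner; echo first; echo second) > "$tmp"
lines="$(wc -l < "$tmp" | tr -d ' ')"
rm -f "$tmp"

# The exit status of the group is the exit status of its list.
status=0
(true; exit 4) || status=$?

echo "$x $lines $status"   # prints "outer 2 4"
```

The assignment `x=inner` happens in the subshell only, so `x` is unchanged afterwards.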

### Shell Expansion

cf. 3.5 Shell Expansions

Expansion is performed on the command line after it has been split into
**tokens**. Seven kinds of expansion are performed, in the following
order:

* brace expansion
* tilde expansion
* parameter and variable expansion
* arithmetic expansion
* command substitution
* word splitting
* filename expansion

Tilde expansion, parameter and variable expansion, arithmetic expansion, and
command substitution are performed at the same step, from left to right.

We only have to implement the following kinds:
* parameter expansion
* word splitting
* filename expansion

After these expansions are performed, quote characters present in the original
word are removed unless they have been quoted themselves ("_quote removal_").

Only brace expansion, word splitting, and filename expansion can increase the
number of words of the expansion; other expansions expand a single word to a
single word.

#### Shell Parameter Expansion
cf. 3.5.3 Shell Parameter Expansion

The '$' character introduces parameter expansion, command substitution, or
arithmetic expansion.

The form is $VAR, where VAR may only contain the following characters:
* a-z
* A-Z
* _
* 0-9 (not as the first character)
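
The naming rule above can be expressed as a pattern check, and its effect on where a name ends can be observed in bash. In this sketch, `is_valid_name` is a hypothetical helper written for illustration, not part of the project, and the variables are made up.

```shell
# Hypothetical helper: succeeds when $1 is a valid variable name.
is_valid_name() {
    case "$1" in
        '' | [0-9]* | *[!A-Za-z0-9_]*) return 1 ;;   # empty, leading digit, or bad character
        *) return 0 ;;
    esac
}

USER_NAME=alice
USER_NAME2=bob

with_dot="$USER_NAME.txt"     # '.' cannot be part of a name, so it ends VAR
with_digit="$USER_NAME2"      # digits are allowed after the first character

echo "$with_dot $with_digit"   # prints "alice.txt bob"
```

The expansion consumes the longest run of valid name characters after '$', which is why `$USER_NAME2` refers to a different variable than `$USER_NAME`.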

#### Word Splitting
cf. 3.5.7 Word Splitting

The shell scans the results of parameter expansion that did not occur within
double quotes for word splitting.

The shell splits these results into **words**.

The shell treats the following characters as delimiters:
* (space)
* (tab)
* (newline)

Explicit null arguments (‘""’ or ‘''’) are retained and passed to commands as
empty strings. Unquoted implicit null arguments, resulting from the expansion
of parameters that have no values, are removed. If a parameter with no value is
expanded within double quotes, a null argument results and is retained and
passed to a command as an empty string. When a quoted null argument appears as
part of a word whose expansion is non-null, the null argument is removed. That
is, the word ‘-d''’ becomes ‘-d’ after word splitting and null argument removal.

Note that if no expansion occurs, no splitting is performed.
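
Counting the arguments a command receives makes these rules easy to verify. This is a sketch assuming the default delimiters (space, tab, newline); `count_args` is a hypothetical helper that prints the number of arguments it was given.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

words="one two   three"
empty=""

unquoted="$(count_args $words)"        # split on blanks: 3 arguments
quoted="$(count_args "$words")"        # no splitting in double quotes: 1
implicit_null="$(count_args $empty)"   # unquoted empty expansion is removed: 0
quoted_null="$(count_args "$empty")"   # quoted empty expansion is kept: 1
explicit_null="$(count_args '' "")"    # two explicit null arguments: 2
attached_null="$(count_args -d'')"     # -d'' is the single word -d: 1

echo "$unquoted $quoted $implicit_null $quoted_null $explicit_null $attached_null"
# prints "3 1 0 1 2 1"
```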

#### Filename Expansion
cf. 3.5.8 Filename Expansion

Bash scans each word for the character '\*'.

If this character appears and is not quoted, then the word is regarded
as a PATTERN, and replaced with an alphabetically sorted list of filenames
matching the pattern (see: _Pattern Matching_). If no matching filenames are
found, the word is left unchanged.

When a pattern is used for filename expansion, the character '.' at the start of
a filename or immediately following a slash must be matched explicitly. In order
to match the filenames '.' and '..', the pattern must begin with '.'.
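
These cases can be reproduced in a scratch directory. The sketch below assumes `mktemp -d` is available and that bash runs with its default globbing options; the file names are made up for the example.

```shell
dir="$(mktemp -d)"
touch "$dir/b.txt" "$dir/a.txt" "$dir/.hidden"

all="$(cd "$dir" && echo *)"          # sorted list, dotfiles skipped
dotted="$(cd "$dir" && echo .h*)"     # leading '.' matched explicitly
nomatch="$(cd "$dir" && echo *.md)"   # no match: the word is left unchanged
quoted="$(cd "$dir" && echo '*')"     # quoted: no expansion at all

rm -rf "$dir"
printf '%s\n' "$all" "$dotted" "$nomatch" "$quoted"
```

The four lines printed are `a.txt b.txt`, `.hidden`, `*.md` and `*`.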

#### Quote Removal
TODO

## Subshell

cf. 3.7.3 Command Execution Environment

The shell has an execution environment, which consists of the following:

* open files inherited by the shell at invocation, as modified by redirections
* the current working directory as set by cd or inherited by the shell at invocation
* shell variables, passed in the environment

A command invoked in a separate execution environment cannot affect the shell’s
own execution environment.

A subshell is a copy of the shell process.
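
Because a subshell works on a copy of the environment, changes to the working directory or to variables made inside it do not reach the parent. A minimal sketch with illustrative variable names:

```shell
before="$(pwd)"
v=parent

# Both the command substitution and the parenthesized group run in subshells.
inner="$(cd / && v=child && echo "$v in $(pwd)")"
(cd / && v=child)

after="$(pwd)"
echo "$inner; parent still has v=$v"
```

After both subshells finish, the parent's working directory and the value of `v` are unchanged.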

## Here Documents
cf. Bash Reference Manual 3.6.6 Here Documents

This type of redirection instructs the shell to read input from the current
source until a line containing only the delimiter word (with no trailing blanks) is seen. All
of the lines read up to that point are then used as the standard input for a
command.

TODO: The following paragraph may not apply fully to our project, check it again!

No parameter and variable expansion, command substitution, arithmetic expansion,
or filename expansion is performed on the delimiter word. If any part of the word is quoted, the
delimiter is the result of quote removal on the word, and the lines in the
here-document are not expanded. If the word is unquoted, all lines of the
here-document are subjected to parameter expansion, command substitution, and
arithmetic expansion, the character sequence \newline is ignored, and ‘\’ must
be used to quote the characters ‘\’, ‘$’, and ‘`’.
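
The effect of quoting the delimiter can be compared side by side in bash. This sketch only covers parameter expansion; `name` and the delimiter `EOF` are illustrative choices.

```shell
name=world

# Unquoted delimiter: the body undergoes parameter expansion.
expanded="$(cat <<EOF
hello $name
EOF
)"

# Quoted delimiter: the body is taken literally.
literal="$(cat <<'EOF'
hello $name
EOF
)"

printf '%s\n' "$expanded" "$literal"
```

The first here-document yields `hello world`, the second keeps `$name` as written.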

## Definitions
cf. [Bash Reference Manual](https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Definitions)
cf. 2 Definitions

**token**
A sequence of characters considered a single unit by the shell. It is either a
word or an operator.

**word**
A sequence of characters treated as a unit by the shell. Words may not include
unquoted metacharacters.

**operator**
A **control operator** or a **redirection operator**.
Operators contain at least one unquoted **metacharacter**.

**control operator**
A token that performs a control function.

It is a newline or one of the following: ‘|’, ‘||’, ‘&&’, ‘(’, or ‘)’.

**redirection operator**
For our project:

* '<' redirects input
* '>' redirects output
* '<<' starts a here-document, read up to a delimiter.
  The delimiter is a **word**.
  The here-document input does not have to update the history.
* '>>' redirects output in append mode

**blank**
A space or tab character.
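
As a quick check of the redirection operators defined above, the following sketch writes to a temporary file with '>' and '>>' and reads it back with '<'. It assumes `mktemp` is available; `f` and `content` are illustrative names.

```shell
f="$(mktemp)"

echo first  > "$f"     # creates or truncates the file
echo second > "$f"     # truncates again: "first" is lost
echo third  >> "$f"    # appends after "second"

content="$(cat < "$f")"   # '<' feeds the file to standard input
rm -f "$f"

printf '%s\n' "$content"
```

The file ends up containing two lines, `second` and `third`.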