mirror of
https://codeberg.org/la-chouette/minishell.git
synced 2025-12-06 07:28:09 +01:00
notes: add lots of notes
parent
7a99014485
commit
2cdd540ed7
1 changed files with 320 additions and 0 deletions
320
NOTES.md
Normal file
@ -0,0 +1,320 @@
# Project notes

cf. [Bash Reference Manual](https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html)

Comparative testing with bash should be done with `bash --norc --posix`.
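
A minimal sketch of how a reference result could be captured for such a comparison. The helper name `run_in` is hypothetical, and it assumes the shell under test reads commands from standard input, which may not hold for an interactive-only build.

```shell
# Hypothetical helper: feed one command line to a shell on stdin and
# report its stdout followed by its exit status.
run_in() {
    shell_cmd=$1    # e.g. "bash --norc --posix"; left unquoted below on purpose
    line=$2
    status=0
    out=$(printf '%s\n' "$line" | $shell_cmd 2>/dev/null) || status=$?
    printf '%s\nexit=%s\n' "$out" "$status"
}

expected=$(run_in "bash --norc --posix" 'echo hello | cat')
printf '%s\n' "$expected"
```

Running the same line through the project binary (for example `./minishell`, an assumed path) and comparing the two results gives a quick regression check. Any prompt text the binary prints would have to be filtered out first.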

## Shell Operation

cf. 3.1.1 Shell Operation

After reading its input, the shell:

* Breaks the input into **words** and **operators**, obeying the *Quoting Rules*.
  These tokens are delimited by **metacharacters**.
* Parses the tokens into simple and compound commands (see *Shell Commands*).
* Performs the various _Shell Expansions_, breaking the expanded tokens into lists
  of filenames and commands and arguments.

### Quoting Rules

cf. 3.1.2 Quoting

Quoting escapes metacharacters.

The quoting mechanisms we have to implement (cf. Subject) are:

* Single quotes, which prevent the interpretation of metacharacters.
* Double quotes, which prevent the interpretation of metacharacters except for '$' (see
  _Shell Parameter Expansion_).

In the Bash Reference Manual, these are defined as follows (keeping only the parts we have to implement):

cf. 3.1.2.2 Single Quotes

Preserves the literal value of each character within the quotes.

cf. 3.1.2.3 Double Quotes

Preserves the literal value of all characters within the quotes, with the exception of '$'.

TODO: The special parameters ‘*’ and ‘@’ have special meaning when in double quotes (see Shell Parameter Expansion).
See if we have to handle this.

Per the subject: minishell should not interpret unclosed quotes.
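
The difference between the two quoting mechanisms can be checked in bash with a short sketch like this one. The variable `USER_NAME` is made up for the demonstration.

```shell
USER_NAME=alice            # hypothetical variable for the demonstration
single='$USER_NAME | *'    # single quotes: every character stays literal
double="$USER_NAME | *"    # double quotes: only '$' keeps its meaning
printf '%s\n' "$single" "$double"
```

The first line printed is `$USER_NAME | *`, the second is `alice | *`. In both cases '|' and '*' lose their special meaning.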

### Shell Commands

cf. 3.2 Shell Commands

A Shell Command may be either a *Simple Command*, a *Pipeline*, a *List of
Commands* (composed of one or more *Pipelines*), or a *Grouped Command*
(composed of one or more *Lists of Commands*).

#### Simple Commands

cf. 3.2.2 Simple Commands

A simple command is a sequence of words separated by **blanks**, terminated by one of the
shell’s **control operators**.
The first **word** specifies a command to be executed, with the rest of the
**words** being that command’s arguments.

The return status (see _Exit Status_) of a simple command is its exit status as
provided by the POSIX 1003.1 waitpid function, or 128+n if the command was
terminated by signal n.
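
A small sketch of both cases, assuming SIGTERM is signal 15 (true on Linux and macOS). The variable names are only for illustration.

```shell
# Normal termination: the status is the value passed to exit.
status_exit=0
sh -c 'exit 42' || status_exit=$?

# Termination by a signal: the child shell kills itself with SIGTERM (15),
# so the reported status is 128 + 15.
status_signal=0
sh -c 'kill -TERM $$' || status_signal=$?

printf '%s %s\n' "$status_exit" "$status_signal"
```

This prints `42 143`. The calling shell may also write a "Terminated" notice to standard error.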

#### Pipelines

cf. 3.2.3 Pipelines

A pipeline is a sequence of one or more commands separated by the control operator '|'.

The output of each command in the pipeline is connected via a pipe to the input
of the next command.
That is, each command reads the previous command’s output.
This connection is performed before any redirections specified by the first
command.

The shell waits for all commands in the pipeline to complete before reading the next command.

Each command in a multi-command pipeline, where pipes are created, is executed
in its own _subshell_, which is a separate process.

e.g.

```shell
export TT=1 | echo $TT
```

prints an empty line: `export` runs in its own subshell, so TT is never set in
the subshell that runs `echo` (assuming TT was not already set in the parent shell).

The exit status of a pipeline is the exit status of the last command in the pipeline.

The shell waits for all commands in the pipeline to terminate before returning a value.
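
The exit status rule can be observed with the following sketch, assuming the default settings (in particular, `pipefail` is not enabled). Each variable keeps its initial 0 unless the pipeline as a whole fails.

```shell
status_a=0
false | true || status_a=$?     # last command succeeds: the pipeline returns 0
status_b=0
true | false || status_b=$?     # last command fails: the pipeline returns 1
printf '%s %s\n' "$status_a" "$status_b"
```

This prints `0 1`: the failure of `false` in the first pipeline is ignored because it is not the last command.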

#### Lists of Commands

cf. 3.2.4 Lists of Commands

A list is a sequence of one or more pipelines separated by one of the
**operators** ‘&&’ or ‘||’, and optionally terminated by a newline.

AND and OR lists are sequences of one or more pipelines separated by the control
operators ‘&&’ and ‘||’, respectively.
AND and OR lists are executed with left associativity.

e.g.

```shell
A && B && C
```

is the same as

```shell
(A && B) && C
```

An AND list has the form

```shell
A && B
```

B is executed if and only if A has an exit status of 0 (success).

An OR list has the form

```shell
A || B
```

B is executed if and only if A has a non-zero exit status (failure).

The return status of AND and OR lists is the exit status of the last command
executed in the list.
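
The rules above can be traced with a short sketch that records which right-hand sides actually run. The `log` variable is only there for the demonstration.

```shell
log=""
true  && log="${log}A"     # runs: the left side succeeded
false && log="${log}B"     # skipped: the left side failed
false || log="${log}C"     # runs: the left side failed
true  || log="${log}D"     # skipped: the left side succeeded

# Left associativity: evaluated as (false && true) || ...
false && true || log="${log}E"

printf '%s\n' "$log"
```

This prints `ACE`. In the last line, `false && true` fails as a whole, so the right-hand side of ‘||’ is executed.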

#### Group of Commands

cf. 3.2.5 Compound Commands

Each group begins with the **control operator** '(' and ends with the
**control operator** ')'.

Any redirections associated with a _group of commands_ apply to all commands
within that _group of commands_ unless explicitly overridden.

cf. 3.2.5.3 Grouping Commands

When commands are grouped, redirections may be applied to the entire command
list. For example, the output of all the commands in the list may be redirected
to a single stream.

`( LIST )`

The parentheses are operators, and are recognized as separate tokens by the
shell even if they are not separated from the LIST by whitespace.

Placing a list of commands between parentheses forces the shell to create a
_subshell_, and each of the commands in LIST is executed in that subshell
environment. Since the LIST is executed in a subshell, variable assignments do
not remain in effect after the subshell completes.

The exit status of this construct is the exit status of LIST.
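
A short sketch of both properties: assignments made inside the group are lost, and one redirection covers the whole group. The variable names and the temporary file are only for illustration.

```shell
value=outer
( value=inner; echo "inside: $value" )   # the assignment happens in the subshell
echo "after: $value"                     # the parent still sees "outer"

# A single redirection applies to every command in the group.
tmp=$(mktemp)
( echo first; echo second ) > "$tmp"
grouped=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$grouped"
```

The file receives both `first` and `second`, and `value` is still `outer` after the group completes.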

### Shell Expansion

cf. 3.5 Shell Expansions

Expansion is performed on the command line after it has been split into
**tokens**. There are seven kinds of expansion, performed in the following
order:

* brace expansion
* tilde expansion
* parameter and variable expansion
* arithmetic expansion
* command substitution
* word splitting
* filename expansion

Tilde expansion, parameter and variable expansion, arithmetic expansion, and
command substitution are performed together in a single left-to-right pass.

We only have to implement the following kinds:
* parameter expansion
* word splitting
* filename expansion

After these expansions are performed, quote characters present in the original
word are removed unless they have been quoted themselves ("_quote removal_").

Only brace expansion, word splitting, and filename expansion can increase the
number of words of the expansion; other expansions expand a single word to a
single word.

#### Shell Parameter Expansion
cf. 3.5.3 Shell Parameter Expansion

The '$' character introduces parameter expansion, command substitution, or
arithmetic expansion.

The form is $VAR, where VAR may only contain the following characters:
* a-z
* A-Z
* _
* 0-9 (not as the first character)
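
The naming rule determines where a variable name ends inside a word. A brief sketch, assuming default shell options (`set -u` is off) and using a made-up variable `VAR_1`:

```shell
VAR_1=value
a="$VAR_1"        # digits and '_' are allowed after the first character
b="$VAR_1-end"    # '-' is not a name character, so the name stops at VAR_1
c="$VAR_1x"       # the name read is VAR_1x, which is unset here
printf '[%s] [%s] [%s]\n' "$a" "$b" "$c"
```

This prints `[value] [value-end] []`: the expansion always consumes the longest run of valid name characters.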

#### Word Splitting
cf. 3.5.7 Word Splitting

The shell scans the results of parameter expansion that did not occur within
double quotes for word splitting.

The shell splits the results of the other expansions into **words**.

The shell treats the following characters as delimiters:
* (space)
* (tab)
* (newline)

Explicit null arguments (`""` or `''`) are retained and passed to commands as
empty strings. Unquoted implicit null arguments, resulting from the expansion
of parameters that have no values, are removed. If a parameter with no value is
expanded within double quotes, a null argument results and is retained and
passed to a command as an empty string. When a quoted null argument appears as
part of a word whose expansion is non-null, the null argument is removed. That
is, the word `-d''` becomes `-d` after word splitting and null argument removal.

Note that if no expansion occurs, no splitting is performed.
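
These cases can be checked by counting the arguments a command receives. The helper `count_args` is hypothetical, and the sketch assumes the default delimiters (space, tab, newline).

```shell
count_args() { echo "$#"; }       # hypothetical helper: prints its argument count

words=$(printf 'a  b\tc')         # two spaces, then a tab
empty=""

n_unquoted=$(count_args $words)          # split into a, b, c
n_quoted=$(count_args "$words")          # no splitting inside double quotes
n_empty_unquoted=$(count_args $empty)    # implicit null argument is removed
n_empty_quoted=$(count_args "$empty")    # quoted null argument is retained
n_explicit=$(count_args "" '')           # explicit null arguments are retained

printf '%s %s %s %s %s\n' "$n_unquoted" "$n_quoted" \
    "$n_empty_unquoted" "$n_empty_quoted" "$n_explicit"
```

This prints `3 1 0 1 2`.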

#### Filename Expansion
cf. 3.5.8 Filename Expansion

Bash scans each word for the character '\*'.

If this character appears, and is not quoted, then the word is regarded
as a PATTERN, and replaced with an alphabetically sorted list of filenames
matching the pattern (see: _Pattern Matching_). If no matching filenames are
found, the word is left unchanged.

When a pattern is used for filename expansion, the character '.' at the start of
a filename or immediately following a slash must be matched explicitly. In order
to match the filenames '.' and '..', the pattern must begin with '.'.
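
A sketch of these rules in a scratch directory, assuming bash's default options (no `nullglob`, no `dotglob`). The file names are arbitrary.

```shell
dir=$(mktemp -d)
touch "$dir/b.txt" "$dir/a.txt" "$dir/.hidden"

all=$(cd "$dir" && echo *)             # sorted matches; .hidden is not matched
dot=$(cd "$dir" && echo .h*)           # the leading '.' is matched explicitly
none=$(cd "$dir" && echo *.nomatch)    # no match: the word is left unchanged
quoted=$(cd "$dir" && echo '*')        # quoted: no expansion at all

rm -rf "$dir"
printf '%s\n' "$all" "$dot" "$none" "$quoted"
```

The four lines printed are `a.txt b.txt`, `.hidden`, `*.nomatch`, and `*`.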

#### Quote Removal
TODO

## Subshell

cf. 3.7.3 Command Execution Environment

The shell has an execution environment, which consists of the following:

* open files inherited by the shell at invocation, as modified by redirections
* the current working directory as set by cd or inherited by the shell at invocation
* shell variables, passed in the environment

A command invoked in a separate execution environment (such as a subshell)
cannot affect the shell’s execution environment.

A subshell is a copy of the shell process.
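
For example, a directory change made in a subshell does not reach the parent shell. This is a minimal sketch, and the variable names are only for illustration.

```shell
start_dir=$(pwd)
inner_dir=$( cd / && pwd )    # the subshell changes its own working directory
end_dir=$(pwd)                # the parent's working directory is untouched
printf '%s\n' "$inner_dir"
```

Here `inner_dir` is `/`, while `start_dir` and `end_dir` are identical.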

## Here Documents
cf. Bash Reference Manual 3.6.6 Here Documents

This type of redirection instructs the shell to read input from the current
source until a line containing only `word` (with no trailing blanks) is seen. All
of the lines read up to that point are then used as the standard input for a
command.

TODO: The following paragraph may not apply fully to our project, check it again!

No parameter and variable expansion, command substitution, arithmetic expansion,
or filename expansion is performed on `word`. If any part of `word` is quoted, the
delimiter is the result of quote removal on `word`, and the lines in the
here-document are not expanded. If `word` is unquoted, all lines of the
here-document are subjected to parameter expansion, command substitution, and
arithmetic expansion, the character sequence \newline is ignored, and ‘\’ must
be used to quote the characters ‘\’, ‘$’, and ‘`’.
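
The effect of quoting the delimiter can be seen in bash with the sketch below. The variable `NAME` is made up, and whether minishell has to reproduce the quoted case is still covered by the TODO above.

```shell
NAME=world

# Unquoted delimiter: the lines of the here-document are expanded.
expanded=$(cat <<EOF
hello $NAME
EOF
)

# Quoted delimiter: the lines are taken literally.
literal=$(cat <<'EOF'
hello $NAME
EOF
)

printf '%s\n' "$expanded" "$literal"
```

The first line printed is `hello world`, the second is `hello $NAME`.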

## Definitions
cf. [Bash Reference Manual](https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Definitions)
cf. 2 Definitions

**token**
A sequence of characters considered a single unit by the shell. It is either a
word or an operator.

**word**
A sequence of characters treated as a unit by the shell. Words may not include
unquoted metacharacters.

**operator**
A **control operator** or a **redirection operator**.
Operators contain at least one unquoted **metacharacter**.

**control operator**
A token that performs a control function.

It is a newline or one of the following: '|', ‘||’, ‘&&’, ‘(’, or ‘)’.

**redirection operator**
For our project:

* '<' redirects input.
* '>' redirects output.
* '<<' is a here-document with a delimiter. The delimiter is a **word**. It does
  not have to update the history.
* '>>' redirects output in append mode.

**blank**
A space or tab character.