Orto – more than just a command line parser

The problem

In principle, command line parameters of a program work the same way as parameters of a subprogram, but there is a huge difference in the amount of work required of the programmer. In a subprogram you declare the parameters with name and type, and to use the value of a parameter you just write its name. The compiler does the rest. Command line parameters, on the other hand, are just a list of strings. Before the program can start its work it has to analyze the command line carefully. It has to check that all the required parameters are present, that there aren't any unrecognized parameters, and that there aren't multiple instances of parameters that are only meaningful to give once. It must interpret the parameters as values of different types and check that the values are within their constraints. If anything is wrong it has to print an informative error message.

Writing code to do all of this in each program can be quite tedious, and it is tempting to do it the easiest way possible. This easily makes the command line syntax unnecessarily strict or difficult to learn, making the program harder to use.

Orto to the rescue

Orto takes care of most of this work and makes command line parameters almost as easy to work with as parameters of subprograms. You declare your command line parameters with name and type, specify whether they are mandatory or optional and whether they may occur more than once, and maybe provide a default value. Orto will then analyze the command line, identify the parameters and interpret them according to their specified types. It checks that the command line is in every way correct, and prints error messages if any errors are found. If all is correct you can then retrieve the value of any parameter with a simple function call, and it is delivered as a value of its specified type, ready to use without further translation.

For numeric parameters you can define a unit that the value is measured in. Orto will then understand and handle the standard unit prefixes. If the unit is "m" for meters, the user may type "5km" or "98cm", and Orto will recognize the prefix and multiply the value by the corresponding factor.
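
As a rough illustration of the prefix arithmetic only, the sketch below scales a number by the factor of its prefix. It does not show how Orto is implemented: Factor and To_Base_Unit are hypothetical helpers invented for this example, and only three prefixes are covered.

```ada
with Ada.Text_IO;

procedure Prefix_Sketch is

   --  Hypothetical helper, not part of Orto: the factor for a few
   --  standard prefixes, or 1.0 when no prefix was typed.
   function Factor (Prefix : String) return Long_Float is
   begin
      if Prefix = "" then
         return 1.0;
      elsif Prefix = "k" then
         return 1.0E3;
      elsif Prefix = "c" then
         return 1.0E-2;
      elsif Prefix = "m" then
         return 1.0E-3;
      else
         raise Constraint_Error with "unknown prefix: " & Prefix;
      end if;
   end Factor;

   --  Convert a number typed with a prefix to the unprefixed unit.
   function To_Base_Unit
     (Number : Long_Float; Prefix : String) return Long_Float is
   begin
      return Number * Factor (Prefix);
   end To_Base_Unit;

begin
   --  "5km" with the unit "m": number 5, prefix "k".
   Ada.Text_IO.Put_Line (Long_Float'Image (To_Base_Unit (5.0, "k")));
   --  "98cm" with the unit "m": number 98, prefix "c".
   Ada.Text_IO.Put_Line (Long_Float'Image (To_Base_Unit (98.0, "c")));
end Prefix_Sketch;
```

With the unit "m", "5km" thus becomes 5000 metres and "98cm" becomes 0.98 metres.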

Orto also prints help texts and version numbers. The same parameter definitions that rule how Orto interprets the parameters are also used to generate the help text, so there is no risk of discrepancies between the help text and how the program actually interprets the command line.

Orto is therefore far more than a command line parser; it's a complete command line parameter handler.

Orto's command line syntax

Programs that use Orto share a common command line syntax. A design principle has been that users shouldn't have to remember more details than necessary about how command line syntax differs between programs. To that end, Orto supports GNU-style parameters with two hyphens before the name, as well as the forms with only one hyphen or no hyphen at all that some programs use. Users also don't have to remember whether to type "new_name" or "new-name", or whether to use a comma or a period as the decimal separator.

The syntax is as follows:

  • Parameters are written in "name=value" form.
  • Parameters may be given in any order.
  • In parameter names and in values of enumeration parameters, upper- and lower-case letters are equivalent, and hyphens are equivalent to underscores.
  • Spaces and underscores are allowed between the digits of numerical values, and for real types a decimal comma or a decimal point may be used. (Values with spaces in them must of course be quoted when entered on a command line, as command shells normally interpret a space as a separator between parameters.)
  • Parameter names may optionally be preceded by one or two hyphens.

Expressed in a form of EBNF, the syntax is:
<program_name> { <whitespace> ["-" | "--"] <parameter_name> "=" <value> }
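
To make the equivalences concrete, here is a minimal sketch of how parameter names and real values could be normalized before they are compared or converted. It is not Orto's actual code: Normalized_Name, Real_Value and Show are hypothetical helpers that use only the standard Ada library.

```ada
with Ada.Characters.Handling;
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Syntax_Sketch is

   --  Hypothetical helper, not part of Orto: drop up to two leading
   --  hyphens, fold to lower case and map hyphens to underscores.
   function Normalized_Name (Raw : String) return String is
      First : Positive := Raw'First;
   begin
      for Count in 1 .. 2 loop
         exit when First > Raw'Last or else Raw (First) /= '-';
         First := First + 1;
      end loop;
      declare
         Name : String :=
           Ada.Characters.Handling.To_Lower (Raw (First .. Raw'Last));
      begin
         for I in Name'Range loop
            if Name (I) = '-' then
               Name (I) := '_';
            end if;
         end loop;
         return Name;
      end;
   end Normalized_Name;

   --  Hypothetical helper: remove spaces and underscores between the
   --  digits and accept a decimal comma as well as a decimal point.
   function Real_Value (Text : String) return Long_Float is
      Clean : String (1 .. Text'Length);
      Last  : Natural := 0;
   begin
      for I in Text'Range loop
         if Text (I) = ',' then
            Last := Last + 1;
            Clean (Last) := '.';
         elsif Text (I) /= ' ' and then Text (I) /= '_' then
            Last := Last + 1;
            Clean (Last) := Text (I);
         end if;
      end loop;
      return Long_Float'Value (Clean (1 .. Last));
   end Real_Value;

   --  Split one argument at the first "=" and print both halves.
   procedure Show (Argument : String) is
      Equals : constant Natural :=
        Ada.Strings.Fixed.Index (Argument, "=");
   begin
      if Equals = 0 then
         Ada.Text_IO.Put_Line (Normalized_Name (Argument) & " (no value)");
      else
         Ada.Text_IO.Put_Line
           (Normalized_Name (Argument (Argument'First .. Equals - 1))
            & " => " & Argument (Equals + 1 .. Argument'Last));
      end if;
   end Show;

begin
   Show ("--New-Name=5");
   Show ("-new_name=5");
   Show ("NEW-NAME=5");
   Ada.Text_IO.Put_Line (Long_Float'Image (Real_Value ("1 000,25")));
end Syntax_Sketch;
```

All three calls to Show print the same line, "new_name => 5", which is the point of the exercise: the user's choice of hyphens and letter case does not matter.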

How to use Orto

This section will use an example program to show step by step how to use Orto.

Defining parameters

AdaCL.Command_Line.Orto is a generic package which takes a discrete type as a parameter. It has three generic child packages that each take a data type as a parameter: Discrete_Parameters, Signed_Integer_Parameters and Floating_Point_Parameters. Discrete_Parameters is mainly intended for enumerations. The example program will have parameters of floating point, integer and enumeration types, so all three generic child packages are needed. Orto uses EAstrings for all string handling, so we also need EAstrings.Latin_1 to convert string literals to EAstrings.

 with AdaCL.Command_Line.Orto.Discrete_Parameters;
 with AdaCL.Command_Line.Orto.Signed_Integer_Parameters;
 with AdaCL.Command_Line.Orto.Floating_Point_Parameters;
 with AdaCL.EAstrings.Latin_1; use AdaCL.EAstrings.Latin_1;

First declare an enumeration type that lists all the parameters' names, and instantiate Orto with this type. This instantiation must be done in a package (because Orto uses generic containers which in turn use controlled types internally, and controlled types must be declared at the library level).

 package Nutrimatic_Parameters is

    type Parameters is (Tea, Volume, Lumps_Of_Sugar, Sugar, Milk, No_Milk,
                        Who, Customer, Client, Greeting);
    package Handler is new AdaCL.Command_Line.Orto (Parameters);
    use Handler;

Then make one instance of Discrete_Parameters, Signed_Integer_Parameters or Floating_Point_Parameters for each type that some parameter will have, except for string types and Boolean.

    type Teas is (Blackcurrant, Ceylon, Earl_Grey, English_Breakfast, Green,
    type Litres is digits 4 range 0.0 .. 5.0;
    package Tea_Param is new Handler.Discrete_Parameters (Teas);
    package Natural_Param is new Handler.Signed_Integer_Parameters (Natural);
    package Volume_Param is new Handler.Floating_Point_Parameters (Litres);

Orto also has a generic subpackage, Analyze. The parameter to Analyze is an array whose index type is the type that Orto was instantiated with – that is, the enumeration of parameter names. This array should be filled with parameter definitions, which are created by the definition functions. By choosing which function to call, you choose both the type of a parameter and how many values it can have. Here, the parameter Tea will have the type Teas. It can only have one value so we call Definition_Single in the newly created package Tea_Param. Each definition function requires a description, an EAstring that will be printed in the help text. A default value for the parameter can also be specified. If there is a default value, the parameter is optional. Otherwise the parameter is mandatory and an error message will be printed if it isn't found on the command line.

    package Analyzer is new Analyze
      ((Tea            => Tea_Param.Definition_Single
          (Description  => +"The taste of dried leaves boiled in water.",
           Default      => Ceylon),

The parameter Volume will have the type Litres, which is handled by the package Volume_Param. Volume can also only have one value so we call Volume_Param.Definition_Single. We also define a unit for this parameter, and thereby turn the prefix handling on.

        Volume         => Volume_Param.Definition_Single
          (Description  => +"How much tea you want.",
           Unit         => +"l",
           Default      => 0.165),

The type of Lumps_Of_Sugar will be Natural. Its name is a little long so we define a shorter synonym, Sugar. Users may then provide the parameter as either "sugar" or "lumps_of_sugar".

        Lumps_Of_Sugar => Natural_Param.Definition_Single
          (Description  => +"How many lumps of sugar you want.",
           Default      => 0),
        Sugar          => Synonym (Lumps_Of_Sugar),

Milk will be Boolean. No generic instantiation is needed as types and functions for boolean parameters are provided in Orto itself. Boolean parameters have some special properties. Firstly, the values "true", "yes" and "on" are all recognized as True, and "false", "no" and "off" are False. Secondly, if Accept_Omitted_Value is set to True, the parameter may be given without a value. Its value is then assumed to be True. Note that this is not the same as the default value. The default value, if there is one, is used when the parameter isn't given at all.

The third special property of booleans is that they may have antonyms. Like a synonym, an antonym is another name for the same parameter, but it negates the value. If a parameter that has the value True is referenced by one of its antonyms, False is returned. Likewise, if an antonym is given on the command line with the value True, and the program requests the value by the primary name or one of its synonyms, False is returned.

Here we define No_Milk as an antonym for Milk. The value may be omitted for Milk, and this applies to No_Milk as well, so if a parameter reads simply "no_milk", Milk will be set to False. The default value is True, so most users will probably omit the parameter entirely if they want milk, and write "no_milk" if they don't want milk.

        Milk           => Boolean_Definition_Single
          (Description          => +"Squirted out of a cow.",
           Default              => True,
           Accept_Omitted_Value => True),
        No_Milk        => Antonym (Milk),

As with booleans, types and functions for string parameters are provided in Orto itself. Who, Customer and Client will be synonyms, with Customer as the primary name. There is no default value, so the parameter must always be given on the command line (by any one of its synonyms).

        Who            => Synonym (Customer),
        Customer       => String_Definition_Single
          (Description  => +"Who has ordered the tea."),
        Client         => Synonym (Customer),

So far, all parameters have been single-valued. Greeting will be multi-valued, which is specified by calling String_Definition_Multi instead of String_Definition_Single. All the definition functions exist in "_Single" and "_Multi" versions. A multi-valued parameter may be given any number of times, including zero. If you want a multi-valued parameter that must be given at least once, call the definition function with At_Least_One set to True. Multi-valued parameters do not have default values.

        Greeting       => String_Definition_Multi
          (Description  => +"Friendly phrases to say.")));

 end Nutrimatic_Parameters;

The parameter definition is now finished.
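
The rules for boolean parameters described above can be summarized in a small stand-alone sketch. It is only an illustration of the rules, not Orto's implementation: To_Boolean and Stored_Value are hypothetical names, and an empty string stands for an omitted value under the assumption that Accept_Omitted_Value is True.

```ada
with Ada.Characters.Handling;
with Ada.Text_IO;

procedure Boolean_Sketch is

   Unknown_Word : exception;

   --  Hypothetical helper, not part of Orto: interpret the words
   --  that are accepted as boolean values.
   function To_Boolean (Word : String) return Boolean is
      Lower : constant String := Ada.Characters.Handling.To_Lower (Word);
   begin
      if Lower = "" then
         return True;   --  omitted value, assumed to be accepted
      elsif Lower = "true" or else Lower = "yes" or else Lower = "on" then
         return True;
      elsif Lower = "false" or else Lower = "no" or else Lower = "off" then
         return False;
      else
         raise Unknown_Word;
      end if;
   end To_Boolean;

   --  The value seen under the primary name: an antonym negates
   --  whatever was typed.
   function Stored_Value
     (Word : String; Typed_As_Antonym : Boolean) return Boolean is
   begin
      return To_Boolean (Word) xor Typed_As_Antonym;
   end Stored_Value;

begin
   --  "no_milk" without a value: Milk is False.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Stored_Value ("", Typed_As_Antonym => True)));
   --  "no_milk=off": the two negations cancel, Milk is True.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Stored_Value ("off", Typed_As_Antonym => True)));
   --  "milk=yes": Milk is True.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Stored_Value ("yes", Typed_As_Antonym => False)));
end Boolean_Sketch;
```

The sketch prints FALSE, TRUE and TRUE. The default value is deliberately left out, since it only applies when the parameter is not given at all.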

Analyzing the command line

Now let's look at the main program. It naturally uses the package where the parameters were defined. Use clauses for the generic instances make things easier.

 with Ada.Text_IO;
 with AdaCL.EAstrings; use AdaCL.EAstrings;
 with AdaCL.EAstrings.IO; use AdaCL.EAstrings.IO;
 with AdaCL.EAstrings.Latin_1; use AdaCL.EAstrings.Latin_1;
 with Nutrimatic_Parameters; use Nutrimatic_Parameters;
 use Nutrimatic_Parameters.Handler;
 use Nutrimatic_Parameters.Natural_Param;
 use Nutrimatic_Parameters.Tea_Param;
 use Nutrimatic_Parameters.Volume_Param;

 procedure Nutrimatic is

This program has two procedures. We will look closer at them shortly.

    procedure Brew is
       type Decilitres is digits 4 range 0.0 .. 50.0;
       package dl_IO is new Ada.Text_IO.Float_IO (Decilitres);
       Amount : constant Litres := Value (Volume);
       Volume_String : String (1 .. 5);
    begin
       dl_IO.Put (Volume_String, Decilitres (Amount * 10.0), 2, 0);
       Put_Line (+"Brewing " & Volume_String & " dl of " &
                 Teas'Image (Value (Tea)) & " tea.");
       if Value (Sugar) > 0 then
          Put_Line (+"Adding sugar.");
       end if;
       for Lump in 1 .. Value (Sugar) loop
          Put_Line (+"plop");
       end loop;
       if Value (Milk) then
          Put_Line (+"Adding milk.");
       end if;
    end Brew;

    procedure Serve is
    begin
       Put_Line ("Here is your tea, " & Value (Customer) & '.');
       for Number in 1 .. Occurrences (Greeting) loop
          Put_Line (Value (Greeting, Number));
       end loop;
    end Serve;

Early in its execution, the program should call the procedure Analyze_Parameters, which is in the generic subpackage Analyze, instantiated as Analyzer in this example. Analyze_Parameters will interpret the command line parameters and store their values. If the parameters on the command line don't fit the definitions, Analyze_Parameters will print error messages to the standard error stream and then raise the exception Stop. The main program only needs to catch Stop and exit.

 begin
    Analyzer.Analyze_Parameters (+"Nutri-Matic 2.2");
    Brew;
    Serve;
 exception
    when Stop =>
       null;
 end Nutrimatic;

The parameters "help" and "version" are recognized automatically, in addition to the defined parameters. If "version" is found on the command line, Analyze_Parameters will print the version string that was given to it as a parameter. If "help" is found, Analyze_Parameters will print a help text which is generated automatically from the parameter definition. Here is what our example program prints:

 > nutrimatic version
 Nutri-Matic 2.2
 > nutrimatic help
 I understand these parameters:

 default: ceylon
 The taste of dried leaves boiled in water.

 volume=0.000E+00..5.000E+00l   default: 1.650E-01l
 How much tea you want.

 lumps_of_sugar=0..2147483647   default: 0
 How many lumps of sugar you want.
 Synonym: sugar

 milk   (default)
 Squirted out of a cow.

 This is the opposite of milk.

 Who has ordered the tea.
 Synonyms: who, client

 greeting=<text>   (multiple)
 Friendly phrases to say.

When the user asks for the version string or the help text like this, they are printed on the standard output stream. The help text is also printed when there are errors on the command line. In that case it is printed on the standard error stream, after the error messages. In all three cases, Stop is raised to indicate that the program should not do its normal job.

It should be noted that you can also define parameters named "help" and "version" explicitly, just like any other parameters. They will then be handled according to your definition, and not treated specially in any way.

Accessing parameter values

Once Analyze_Parameters has executed without errors, the program may start using the parameter values. Values are retrieved by calling the Value functions, as seen in the procedures Brew and Serve of the example program. Thanks to overload resolution based on return type, the right Value function can usually be selected automatically. Here an enumeration value and an integer are retrieved and immediately used in expressions:

       Put_Line (+"Brewing " & Volume_String & " dl of " &
                 Teas'Image (Value (Tea)) & " tea.");
       if Value (Sugar) > 0 then

Values can be retrieved in any order and any number of times. Here Sugar is retrieved again:

       for Lump in 1 .. Value (Sugar) loop

Note that Sugar is a synonym of Lumps_Of_Sugar. Parameters can be referenced by their synonyms and antonyms just as well as by their primary names. As stated above, boolean values are negated when retrieved by antonyms.
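
The overload resolution mentioned above is an ordinary Ada feature and can be tried without Orto. In the sketch below, two hypothetical Value functions differ only in their result type, and the compiler picks one from the context of each call. The types and the returned values are made up for the example.

```ada
with Ada.Text_IO;

procedure Overload_Sketch is

   type Parameters is (Tea, Sugar);
   type Teas is (Ceylon, Green);

   --  Stand-in for the Value function of an enumeration instance.
   function Value (Parameter : Parameters) return Teas is
   begin
      case Parameter is
         when Tea   => return Ceylon;
         when Sugar => raise Constraint_Error;  --  not a Teas parameter
      end case;
   end Value;

   --  Stand-in for the Value function of an integer instance.
   function Value (Parameter : Parameters) return Natural is
   begin
      case Parameter is
         when Sugar => return 3;
         when Tea   => raise Constraint_Error;  --  not an integer parameter
      end case;
   end Value;

   --  The declared type of the constant selects the Natural version.
   Lumps : constant Natural := Value (Sugar);

begin
   --  Teas'Image expects a Teas value, so the first function is chosen.
   Ada.Text_IO.Put_Line (Teas'Image (Value (Tea)));

   --  Comparison with an integer literal selects the Natural version.
   if Value (Sugar) > 0 then
      Ada.Text_IO.Put_Line (Natural'Image (Lumps));
   end if;

   --  Where the context does not decide, a qualified expression does.
   Ada.Text_IO.Put_Line (Natural'Image (Natural'(Value (Sugar))));
end Overload_Sketch;
```

In the rare case where the context leaves more than one candidate, a qualified expression such as Natural'(Value (Sugar)) states the intended type explicitly.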

For a multi-valued parameter, an array of all the values is available from the function Values. The values are placed in the array in the order they appear on the command line. You can also get the number of values from the function Occurrences and then retrieve a specific value by passing an index to Value:

       for Number in 1 .. Occurrences (Greeting) loop
          Put_Line (Value (Greeting, Number));
       end loop;

We can now run the program like this: (Note the hyphen in "no-milk".)

 > nutrimatic customer=Trillian no-milk sugar=3
 Brewing  1.65 dl of CEYLON tea.
 Adding sugar.
 plop
 plop
 plop
 Here is your tea, Trillian.

To brew a large cup of English Breakfast tea with milk we might run it like this: (Note the decimal comma, and how centilitres are converted to decilitres.)

 > nutrimatic Tea=English_Breakfast Volume=22,5cl Greeting='Share and enjoy!' Who=Arthur Greeting="Will that be all?"
 Brewing  2.25 dl of ENGLISH_BREAKFAST tea.
 Adding milk.
 Here is your tea, Arthur.
 Share and enjoy!
 Will that be all?

Controlling the printing of messages

Sometimes you might want full control over when and how messages are printed. For that purpose there is another version of Analyze_Parameters: a function that you can call instead of the procedure described above. This function doesn't print anything, but returns an Analysis_Result value which indicates whether there were errors and whether a version string or a help text was requested. You can then call the procedures Print_Errors and Print_Help if you like. Don't try to retrieve parameter values if there were errors.

Final notes

The data structure where parameter data is stored is global inside the package. There is no reason to have more than one instance of this structure as there is only one command line anyway. In a program with tasks, several tasks can retrieve parameter values concurrently without risk, as this doesn't make any changes to the global data. Just make sure that Analyze_Parameters has executed without errors before any values are retrieved.
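
The pattern behind this advice, writing the shared data once before any task starts and only reading it afterwards, can be sketched with plain Ada tasking. Nothing here is Orto code: Lumps stands in for the stored parameter data, and Reader and Total are hypothetical names.

```ada
with Ada.Text_IO;

procedure Tasking_Sketch is

   --  Stand-in for the stored parameter data.
   Lumps : Natural := 0;

   --  Collects what the readers saw; results written by several
   --  tasks need protection, the read-only input does not.
   protected Total is
      procedure Add (Amount : Natural);
      function Sum return Natural;
   private
      Value : Natural := 0;
   end Total;

   protected body Total is
      procedure Add (Amount : Natural) is
      begin
         Value := Value + Amount;
      end Add;

      function Sum return Natural is
      begin
         return Value;
      end Sum;
   end Total;

   task type Reader;

   task body Reader is
   begin
      Total.Add (Lumps);   --  only reads the shared data
   end Reader;

begin
   --  The "analysis" step: completed before any reader task exists.
   Lumps := 3;

   declare
      Readers : array (1 .. 4) of Reader;
   begin
      null;   --  leaving the block waits for all readers to finish
   end;

   Ada.Text_IO.Put_Line (Natural'Image (Total.Sum));
end Tasking_Sketch;
```

Because the four readers start only after the assignment to Lumps, each of them sees the value 3 and the sketch prints 12.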

Do not call Analyze_Parameters more than once, and do not instantiate Analyze more than once from the same instance of Orto. It should be safe to make and use several instances of Orto, but there is little reason to do so.

Ada programming, © 2005,2006 the Authors, Content is available under GNU Free Documentation License.