Comparison of programming languages (syntax)

From Wikipedia, the free encyclopedia

This article compares the syntax of many notable programming languages.

Expressions

Programming language expressions can be broadly classified into four syntax structures:

prefix notation
  • Lisp (* (+ 2 3) (expt 4 5))
infix notation
suffix, postfix, or Reverse Polish notation
math-like notation
  • TUTOR (2 + 3)(4⁵) $$ note implicit multiply operator
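
As a rough illustration of how the notations differ, the Lisp expression above can be evaluated in prefix, infix and postfix form. This is a minimal Python sketch, not part of any language definition: the helper names eval_prefix and eval_postfix are hypothetical, tokens are assumed to be space-separated, and ^ is used here as an assumed spelling of exponentiation.

```python
import operator

# Binary operators shared by both evaluators (an illustrative subset).
OPS = {"+": operator.add, "*": operator.mul, "^": operator.pow}

def eval_postfix(tokens):
    """Evaluate Reverse Polish notation with an explicit operand stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()

def eval_prefix(tokens):
    """Evaluate prefix notation by scanning the tokens right to left."""
    stack = []
    for tok in reversed(tokens):
        if tok in OPS:
            left = stack.pop()
            right = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()

infix_value = (2 + 3) * 4 ** 5                        # infix, as Python writes it
prefix_value = eval_prefix("* + 2 3 ^ 4 5".split())   # same expression, prefix
postfix_value = eval_postfix("2 3 + 4 5 ^ *".split()) # same expression, postfix
```

All three forms describe the same computation; only the position of the operator relative to its operands changes.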

Statement delimitation

A language that supports the statement construct typically has rules for one or more of the following aspects:

  • Statement terminator – marks the end of a statement
  • Statement separator – demarcates the boundary between two statements; not needed for the last statement
  • Line continuation – escapes a newline to continue a statement on the next line

Some languages define a special character as a terminator while some, called line-oriented, rely on the newline. Typically, a line-oriented language includes a line continuation feature whereas other languages have no need for line continuation since newline is treated like other whitespace. Some line-oriented languages provide a separator for use between statements on one line.
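
The difference between a separator and a terminator can be sketched with two naive splitters. This is an illustrative Python sketch only: the function names are hypothetical, the semicolon is an assumed delimiter, and string literals or comments that contain the delimiter are deliberately not handled.

```python
def split_separated(source, sep=";"):
    """Separator style: the mark sits between statements, none after the last."""
    return [part.strip() for part in source.split(sep)]

def split_terminated(source, term=";"):
    """Terminator style: every statement, including the last, ends with the mark."""
    text = source.rstrip()
    if not text.endswith(term):
        raise ValueError("last statement is missing its terminator")
    # Drop the final terminator, then split on the remaining ones.
    return [part.strip() for part in text[: -len(term)].split(term)]

separated = split_separated("a := 1; b := 2; c := 3")
terminated = split_terminated("a = 1; b = 2; c = 3;")
```

Both calls yield three statements; the terminated form rejects input whose last statement lacks the closing mark, while the separated form does not expect one.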

Language Statement delimitation
ABAP period separated
Ada semicolon terminated
ALGOL semicolon separated
ALGOL 68 semicolon and comma separated[1]
APL newline terminated, ⋄ separated
AppleScript newline terminated
AutoHotkey newline terminated
BASIC newline terminated, colon separated
Boo newline terminated
C semicolon terminated, comma separated expressions
C++ semicolon terminated, comma separated expressions
C# semicolon terminated
COBOL whitespace separated, sometimes period separated, optionally separated with commas and semi-colons
Cobra newline terminated
CoffeeScript newline terminated
CSS semicolon terminated
D semicolon terminated
Eiffel newline terminated, semicolon separated
Erlang colon separated, period terminated
F# newline terminated, semicolon separated
Fortran newline terminated, semicolon separated
Forth semicolons terminate word definitions; space terminates word use
GFA BASIC newline terminated
Go semicolon separated (inserted by compiler)
Haskell in do-notation: newline separated; in do-notation with braces: semicolon separated
Java semicolon terminated
JavaScript semicolon separated (but often inserted as statement terminator)
Kotlin semicolon separated (but sometimes implicitly inserted on newlines)
Lua whitespace separated (semicolon optional)
Mathematica a.k.a. Wolfram semicolon separated
MATLAB newline terminated, separated by semicolon or comma (semicolon – result of preceding statement hidden, comma – result displayed)
MUMPS a.k.a. M newline terminates line-scope, the closest to a "statement" that M has, a space separates/terminates a command, allowing another command to follow
Nim newline terminated
Object Pascal (Delphi) semicolon separated
Objective-C semicolon terminated
OCaml semicolon separated
Pascal semicolon separated
Perl semicolon separated
PHP semicolon terminated
Pick Basic newline terminated, semicolon separated
PowerShell newline terminated, semicolon separated
Prolog comma separated (conjunction), semicolon separated (disjunction), period terminated (clause)
Python newline terminated, semicolon separated
R newline terminated, semicolon separated[2]
Raku semicolon separated
Red whitespace separated
Ruby newline terminated, semicolon separated
Rust semicolon terminated, comma separates expressions
Scala newline terminated, semicolon separator
Seed7 semicolon separated (semicolon termination is allowed)
Simula semicolon separated
S-Lang semicolon separated
Smalltalk period separated
Standard ML semicolon separated
Swift semicolon separated (inserted by compiler)
V (Vlang) newline terminated, comma or semicolon separated
Visual Basic newline terminated, colon separated
Visual Basic .NET newline terminated, colon separated
Xojo newline terminated
Zig semicolon terminated

Line continuation

Listed below are notable line-oriented languages that provide for line continuation. Unless otherwise noted, the continuation marker must be the last text of the line.

Ampersand
Backslash
Backtick
Hyphen
Underscore
Ellipsis (three dots)
  • MATLAB: The ellipsis need not end the line, but text following it is ignored.[5] It begins a comment that extends through (including) the first subsequent newline. Contrast this with a line comment which extends until the next newline.
Comma delimiter
  • Ruby: comment may follow delimiter
Left bracket delimiter
Operator symbol
  • Ruby: as last object of line; comment may follow operator
  • AutoHotkey: As the first character of continued line; any expression operators except ++ and --, and a comma or a period[7]
Some form of line comment serves as line continuation
Character position
  • Fortran 77: A non-comment line is a continuation of the prior non-comment line if any non-space character appears in column 6. Comment lines cannot be continued.
  • COBOL: String constants may be continued by not ending the original string in a PICTURE clause with ', then inserting a - in column 7 (same position as the * for comment is used.)
  • TUTOR: Lines starting with a tab (after any indentation required by the context) continue the prior command.

The C compiler concatenates adjacent string literals even if on separate lines, but this is not line continuation syntax as it works the same regardless of the kind of whitespace between the literals.
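
Python is one line-oriented language that shows several of these mechanisms side by side. The following is a small sketch with arbitrary example values: a trailing backslash joins lines explicitly, an open bracket joins them implicitly, and adjacent string literals are concatenated much as in C.

```python
# Explicit line joining: the backslash must be the last character on the line.
total = 1 + 2 + \
        3 + 4

# Implicit joining: newlines inside (), [] or {} do not end the statement.
values = [
    1, 2,
    3, 4,
]

# Adjacent string literals are concatenated; the parentheses (not the
# literals themselves) are what let the expression span two lines.
message = ("line "
           "continuation")
```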

Consuming external software

Languages support a variety of ways to reference and consume other software in the syntax of the language. In some cases this is importing the exported functionality of a library, package or module but some mechanisms are simpler text file include operations.

Import can be classified by level (module, package, class, procedure,...) and by syntax (directive name, attributes,...).

File include
  • #include <filename> or #include "filename" – C preprocessor, used in conjunction with C, C++ and other development tools
File import
Package import
Class import
  • from module import Class – Python
  • import package.class – Java, MATLAB, Kotlin
  • import class from "modname"; – JavaScript
  • import {class} from "modname"; – JavaScript
  • import {class as altname} from "modname"; – JavaScript
  • import package.class – Scala
  • import package.{ class1 => alternativeName, class2 } – Scala
  • import package._ – Scala
  • use Namespace\ClassName; – PHP
  • use Namespace\ClassName as AliasName; – PHP
Procedure/function import
  • from module import function – Python
  • import package.module : symbol; – D
  • import package.module : altsymbolname = symbol; – D
  • import Module (function) – Haskell
  • import function from "modname"; – JavaScript
  • import {function} from "modname"; – JavaScript
  • import {function as altname} from "modname"; – JavaScript
  • import package.function – MATLAB
  • import package.class.function – Scala
  • import package.class.{ function => alternativeName, otherFunction } – Scala
  • use Module ('symbol'); – Perl
  • use function Namespace\function_name; – PHP
  • use Namespace\function_name as function_alias_name; – PHP
  • use module::submodule::symbol; – Rust
  • use module::submodule::{symbol1, symbol2}; – Rust
  • use module::submodule::symbol as altname; – Rust
Constant import
  • use const Namespace\CONST_NAME; – PHP

The above statements can also be classified by whether they are a syntactic convenience (allowing things to be referred to by a shorter name, but they can still be referred to by some fully qualified name without import), or whether they are actually required to access the code (without which it is impossible to access the code, even with fully qualified names).

Syntactic convenience
  • import package.* – Java
  • import package.class – Java
  • open module – OCaml
Required to access code
  • import altname "package/name" – Go
  • import altname from "modname"; – JavaScript
  • import module – Python
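
The Python forms listed above can be tried with standard-library modules. The sketch below is illustrative only; the modules math, fractions and statistics were chosen arbitrarily. It also shows why Python's import counts as required rather than a convenience: a module that has not been imported cannot be reached by its qualified name.

```python
import math                        # module import: names stay qualified
from math import sqrt              # function import: usable unqualified
from math import pi as circle_pi   # import under an alternative name
from fractions import Fraction     # class import

qualified = math.sqrt(16.0)
unqualified = sqrt(16.0)
same_object = sqrt is math.sqrt    # both names refer to one function object
half_plus_half = Fraction(1, 2) + Fraction(1, 2)
area = circle_pi * 2.0 ** 2

# The statistics module has not been imported, so even the fully
# qualified name is unbound and raises NameError.
try:
    statistics.mean([1, 2, 3])
    needs_import = False
except NameError:
    needs_import = True
```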

Block delimitation

A block is a grouping of code that is treated collectively. Many block syntaxes can consist of any number of items (statements, expressions or other units of code) – including one or zero. Languages delimit a block in a variety of ways – some via marking text and others by relative formatting such as levels of indentation.

Curly braces (a.k.a. curly brackets) { ... }
  • Curly brace languages: A defining aspect of curly brace languages is that they use curly braces to delimit a block.
Parentheses ( ... )
Square brackets [ ... ]
begin ... end
do ... end
do ... done
do ... end
  • Lua, Ruby (pass blocks as arguments, for loop), Seed7 (encloses loop bodies between do and end)
X ... end (e.g. if ... end):
  • Ruby (if, while, until, def, class, module statements), OCaml (for & while loops), MATLAB (if & switch conditionals, for & while loops, try clause, package, classdef, properties, methods, events, & function blocks), Lua (then / else & function)
(begin ...)
(progn ...)
(do ...)
Indentation
Others
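
To contrast marking text with relative formatting, the following Python sketch measures nesting depth in a brace-delimited fragment and in an indentation-delimited one. The function names are hypothetical, the sample fragments are made up, indentation is assumed to be four spaces per level, and braces inside strings or comments are not handled.

```python
def max_brace_depth(source):
    """Deepest nesting in text whose blocks are marked with { and }."""
    depth = deepest = 0
    for ch in source:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth -= 1
    return deepest

def max_indent_depth(source, width=4):
    """Deepest nesting in text whose blocks are marked by indentation."""
    deepest = 0
    for line in source.splitlines():
        if line.strip():                      # blank lines carry no level
            indent = len(line) - len(line.lstrip(" "))
            deepest = max(deepest, indent // width)
    return deepest

braced = "while (x) { if (y) { f(); } }"
indented = "while x:\n    if y:\n        f()\n"
brace_depth = max_brace_depth(braced)
indent_depth = max_indent_depth(indented)
```

The brace version depends only on the delimiter characters, so the fragment can sit on one line; the indentation version depends on the layout of the lines themselves.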

Comments

With respect to a language definition, the syntax of comments can be classified in many ways, including:

  • Line vs. block – a line comment starts with a delimiter and continues to the end of the line (newline marker) whereas a block comment starts with one delimiter and ends with another and can cross lines
  • Nestable – whether a block comment can be inside another block comment
  • How parsed with respect to the language; tools (including compilers and interpreters) may also parse comments but that may be outside the language definition

Other ways to categorize comments that are outside a language definition:

  • Inline vs. prologue – an inline comment follows code on the same line and a prologue comment precedes program code to which it pertains; line or block comments can be used as either inline or prologue
  • Support for API documentation generation which is outside a language definition

Line comment

Symbol Languages
C Fortran I to Fortran 77 (C in column 1)
REM BASIC, Batch files, Visual Basic
:: Batch files, COMMAND.COM, cmd.exe
NB. J; from the (historically) common abbreviation Nota bene, the Latin for "note well".
⍝ APL; the mnemonic is that the glyph (jot overstruck with shoe-down) resembles a desk lamp, and hence "illuminates" the foregoing.
# Boo, Bourne shell and other UNIX shells, Cobra, Perl, Python, Ruby, Seed7, PowerShell, PHP, R, Make, Maple, Elixir, Julia, Nim[10]
% TeX, Prolog, MATLAB,[11] Erlang, S-Lang, Visual Prolog, PostScript
// ActionScript, Boo, C (C99), C++, C#, D, F#, Go, Java, JavaScript, Kotlin, Object Pascal (Delphi), Objective-C, PHP, Rust, Scala, Sass, Swift, Xojo, V (Vlang), Zig
' Monkey, Visual Basic, VBScript, Small Basic, Gambas, Xojo
! Factor, Fortran, Basic Plus, Inform, Pick Basic
; Most assembly languages, AutoHotkey, AutoIt, Lisp, Common Lisp, Clojure, PGN, Rebol, Red, Scheme
-- Euphoria, Haskell, SQL, Ada, AppleScript, Eiffel, Lua, VHDL, SGML, PureScript, Elm
* Assembler S/360 (* in column 1), COBOL I to COBOL 85, PAW, Fortran IV to Fortran 77 (* in column 1), Pick Basic, GAMS (* in column 1)
|| Curl
" Vimscript, ABAP
\ Forth
*> COBOL 2002

Block comment

In these examples, ~ represents the comment content, and the text around it forms the delimiters. Whitespace (including newline) is not part of the delimiters.

Syntax Languages
comment ~ ; ALGOL 60, SIMULA
¢ ~ ¢, # ~ #, co ~ co, comment ~ comment ALGOL 68[12][13]
/* ~ */ ActionScript, AutoHotkey, C, C++, C#, D,[14] Go, Java, JavaScript, Kotlin, Objective-C, PHP, PL/I, Prolog, Rexx, Rust (can be nested), Scala (can be nested), SAS, SASS, SQL, Swift (can be nested), V (Vlang), Visual Prolog, CSS
#cs ~ #ce AutoIt[15]
/+ ~ +/ D (can be nested)[14]
/# ~ #/ Cobra (can be nested)
<# ~ #> PowerShell
<!-- ~ --> HTML, XML
=begin ~ =cut Perl (Plain Old Documentation)
#`( ~ ) Raku (bracketing characters can be (), <>, {}, [], any Unicode characters with BiDi mirrorings, or Unicode characters with Ps/Pe/Pi/Pf properties)
=begin ~ =end Ruby
#<TAG> ~ #</TAG>, #stop ~ EOF, #iffalse ~ #endif, #ifntrue ~ #endif, #if false ~ #endif, #if !true ~ #endif S-Lang[16]
{- ~ -} Haskell (can be nested)
(* ~ *) Delphi, ML, Mathematica, Object Pascal, Pascal, Seed7, AppleScript, OCaml (can be nested), Standard ML (can be nested), Maple, Newspeak, F#
{ ~ } Delphi, Object Pascal, Pascal, PGN, Red
{# ~ #} Nunjucks, Twig
{{! ~ }} Mustache, Handlebars
{{!-- ~ --}} Handlebars (cannot be nested, but may contain {{ and }})
|# ~ #| Curl
%{ ~ %} MATLAB[11] (the symbols must be in a separate line)
#| ~ |# Lisp, Scheme, Racket (can be nested in all three).
#= ~ =# Julia[17]
#[ ~ ]# Nim[18]
--[[ ~ ]], --[=[ ~ ]=], --[=...=[ ~ ]=...=] Lua (brackets can have any number of matching = characters; can be nested within non-matching delimiters)
" ~ " Smalltalk
(comment ~ ) Clojure
#If COMMENT Then ~ #End If[a] Visual Basic .NET
#if COMMENT ~ #endif[b] C#
' comment _ or REM comment _[c] Classic Visual Basic, VBA, VBScript

Unique variants

Fortran

Indenting lines in Fortran 66/77 is significant. The actual statement is in columns 7 through 72 of a line. Any non-space character in column 6 indicates that this line is a continuation of the prior line. A 'C' in column 1 indicates that this entire line is a comment. Columns 1 through 5 may contain a number which serves as a label. Columns 73 through 80 are ignored and may be used for comments; in the days of punched cards, these columns often contained a sequence number so that the deck of cards could be sorted into the correct order if someone accidentally dropped the cards. Fortran 90 removed the need for the indentation rule and added line comments, using the ! character as the comment delimiter.
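
The column rules can be sketched as a small line classifier. This Python sketch follows the description above rather than the full standard: the function name and dictionary keys are hypothetical, tab characters are not handled, and only an uppercase C in column 1 is treated as a comment marker.

```python
def classify_fixed_form(line):
    """Split one fixed-form source line according to its column positions."""
    card = line.rstrip("\n").ljust(80)        # pad so every column exists
    if card[0] == "C":                        # 'C' in column 1: comment line
        return {"kind": "comment", "text": card[1:].strip()}
    kind = "continuation" if card[5] != " " else "statement"
    return {
        "kind": kind,
        "label": card[0:5].strip(),           # columns 1 through 5
        "text": card[6:72].strip(),           # columns 7 through 72
        "sequence": card[72:80].strip(),      # columns 73 through 80 (ignored)
    }

comment_line = classify_fixed_form("C     compute the sum")
labelled_line = classify_fixed_form("   10 TOTAL = A + B")
continued_line = classify_fixed_form("     &        + C")
numbered_card = classify_fixed_form("      X = 1".ljust(72) + "00000010")
```

Python indices are zero-based, so column 6 is card[5] and columns 7 through 72 are the slice card[6:72].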

COBOL

In fixed format code, line indentation is significant. Columns 1–6 and columns from 73 onwards are ignored. If a * or / is in column 7, then that line is a comment. Until COBOL 2002, if a D or d was in column 7, it would define a "debugging line" which would be ignored unless the compiler was instructed to compile it.

Cobra

Cobra supports block comments with "/# ... #/" which is like the "/* ... */" often found in C-based languages, but with two differences. The # character is reused from the single-line comment form "# ...", and the block comments can be nested which is convenient for commenting out large blocks of code.
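
Nestable block comments cannot be removed by searching for the first closing delimiter; a depth counter is needed. The following Python sketch illustrates the idea for the /# ... #/ delimiters. It is not Cobra's actual lexer: the function name is hypothetical, and string literals and # line comments are ignored for simplicity.

```python
def strip_nested_comments(source, open_tag="/#", close_tag="#/"):
    """Remove block comments, tracking nesting with a depth counter."""
    out = []
    depth = 0
    i = 0
    while i < len(source):
        if source.startswith(open_tag, i):
            depth += 1                        # entering a (possibly nested) comment
            i += len(open_tag)
        elif depth > 0 and source.startswith(close_tag, i):
            depth -= 1                        # leaving one level of comment
            i += len(close_tag)
        else:
            if depth == 0:                    # keep only text outside comments
                out.append(source[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated block comment")
    return "".join(out)

stripped = strip_nested_comments("a /# outer /# inner #/ still outer #/ b")
```

With non-nestable comments such as /* ... */ in C, the first closing delimiter would end the comment and "still outer" would be treated as code.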

Curl

Curl supports block comments with user-defined tags as in |foo# ... #foo|.

Lua

Like raw strings, there can be any number of equals signs between the square brackets, provided both the opening and closing tags have a matching number of equals signs; this allows nesting as long as nested block comments/raw strings use a different number of equals signs than their enclosing comment: --[[comment --[=[ nested comment ]=] ]]. Lua discards the first newline (if present) that directly follows the opening tag.
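
The matching-level rule can be expressed with a regular-expression backreference. The following Python sketch is illustrative and not how Lua itself scans comments: the names LONG_COMMENT and first_block_comment are hypothetical, and the pattern does not discard a leading newline or recognise comment openers that appear inside string literals.

```python
import re

# \1 forces the closing bracket to repeat the opening bracket's run of '='.
LONG_COMMENT = re.compile(r"--\[(=*)\[(.*?)\]\1\]", re.DOTALL)

def first_block_comment(source):
    """Return (level, body) of the first long-bracket comment, or None."""
    match = LONG_COMMENT.search(source)
    if match is None:
        return None
    return len(match.group(1)), match.group(2)

level, body = first_block_comment("--[[comment --[=[ nested comment ]=] ]]")
level2, body2 = first_block_comment("--[==[ a ]] b ]==]")
```

In the first call the inner ]=] does not close the level-0 comment, so the whole nested comment ends up in the body.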

Perl

Block comments in Perl are considered part of the documentation, and are given the name Plain Old Documentation (POD). Technically, Perl does not have a convention for including block comments in source code, but POD is routinely used as a workaround.

PHP

PHP supports standard C/C++ style comments (// and /* ... */) as well as Perl-style # line comments.

Python

Using triple quotes to comment out lines of source does not actually form a comment.[19] The enclosed text becomes a string literal, which Python usually ignores (except when it is the first statement in the body of a module, class or function; see docstring).
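
The distinction can be observed directly. In this small sketch (the function names are made up for illustration), the same kind of triple-quoted literal is stored as documentation in one position and evaluated and discarded in another.

```python
def documented():
    """First statement in the body: stored as the docstring."""
    return 1

def not_documented():
    value = 2
    """Not the first statement: evaluated as an expression and discarded."""
    return value

doc = documented.__doc__         # the string literal above
no_doc = not_documented.__doc__  # None: the later literal is not a docstring
```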

Elixir

The above trick used in Python also works in Elixir, but the compiler will throw a warning if it spots this. To suppress the warning, one would need to prepend the sigil ~S (which prevents string interpolation) to the triple-quoted string, leading to the final construct ~S""" ... """. In addition, Elixir supports a limited form of block comments as an official language feature, but as in Perl, this construct is entirely intended to write documentation. Unlike in Perl, it cannot be used as a workaround, being limited to certain parts of the code and throwing errors or even suppressing functions if used elsewhere.[20]

Raku

Raku uses #`(...) to denote block comments.[21] Raku actually allows the use of any "right" and "left" paired brackets after #` (i.e. #`(...), #`[...], #`{...}, #`<...>, and even the more complicated #`{{...}} are all valid block comments). Brackets are also allowed to be nested inside comments (i.e. #`{ a { b } c } goes to the last closing brace).

Ruby

A block comment in Ruby opens at a =begin line and closes at an =end line.

S-Lang

The region of lines enclosed by the #<tag> and #</tag> delimiters is ignored by the interpreter. The tag name can be any sequence of alphanumeric characters and may be used to indicate how the enclosed block is to be deciphered. For example, #<latex> could indicate the start of a block of LaTeX formatted documentation.

Scheme and Racket

The next complete syntactic component (s-expression) can be commented out with #; .

ABAP

ABAP supports two kinds of comments. If the first character of a line, including indentation, is an asterisk (*), the whole line is considered a comment, while a double quote (") begins an in-line comment that runs until the end of the line. ABAP comments are not possible between the statements EXEC SQL and ENDEXEC because Native SQL has other uses for these characters. In most SQL dialects the double dash (--) can be used instead.

Esoteric languages

Many esoteric programming languages follow the convention that any text not executed by the instruction pointer (e.g., Befunge) or otherwise assigned a meaning (e.g., Brainfuck), is considered a "comment".
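
For Brainfuck this convention amounts to a character filter: the eight command characters are code and everything else is commentary. A minimal Python sketch follows, with a hypothetical function name and a made-up sample program.

```python
BRAINFUCK_COMMANDS = set("><+-.,[]")   # the eight characters with a meaning

def split_code_and_comment(source):
    """Separate meaningful characters from everything else."""
    code = "".join(ch for ch in source if ch in BRAINFUCK_COMMANDS)
    comment = "".join(ch for ch in source if ch not in BRAINFUCK_COMMANDS)
    return code, comment

code, comment = split_code_and_comment("+++ add three [-] then clear the cell")
```

One consequence is that commentary must avoid the command characters themselves; a stray comma or period in the prose would be executed.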

Comment comparison

There is a wide variety of syntax styles for declaring comments in source code. In the table below, BlockComment stands for the text of a block comment and LineComment for the text of a line comment.

Language In-line comment Block comment
Ada, Eiffel, Euphoria, Occam, SPARK, ANSI SQL, and VHDL -- LineComment
ALGOL 60 comment BlockComment;
ALGOL 68 ¢ BlockComment ¢

comment BlockComment comment
co BlockComment co
# BlockComment #
£ BlockComment £

APL ⍝ LineComment
AppleScript -- LineComment (* BlockComment *)
Assembly language (varies) ; LineComment   one example (most assembly languages use line comments only)
AutoHotkey ; LineComment /* BlockComment */
AWK, Bourne shell, C shell, Maple, PowerShell # LineComment <# BlockComment #>
Bash # LineComment <<EOF
BlockComment
EOF


: '
BlockComment
'
BASIC (various dialects): 'LineComment (not all dialects)

*LineComment (not all dialects)
!LineComment (not all dialects)
REM LineComment

C (K&R, ANSI/C89/C90), CHILL, PL/I, REXX /* BlockComment */
C (C99), C++, Go, Swift, JavaScript, V (Vlang) // LineComment /* BlockComment */
C# // LineComment
/// LineComment (XML documentation comment)
/* BlockComment */
/** BlockComment */ (XML documentation comment)
#if COMMENT
  BlockComment
#endif
(Compiler directive)[b]
COBOL I to COBOL 85 * LineComment (* in column 7)
COBOL 2002 *> LineComment
Curl || LineComment |# BlockComment #|

|foo# BlockComment #foo|

Cobra # LineComment /# BlockComment #/ (nestable)
D // LineComment
/// Documentation LineComment (ddoc comments)
/* BlockComment */
/** Documentation BlockComment */ (ddoc comments)

/+ BlockComment +/ (nestable)
/++ Documentation BlockComment +/ (nestable, ddoc comments)

DCL $! LineComment
ECMAScript (JavaScript, ActionScript, etc.) // LineComment /* BlockComment */
Elixir # LineComment ~S"""
BlockComment
"""

@doc """
BlockComment
"""
(Documentation, only works in modules)
@moduledoc """
BlockComment
"""
(Module documentation)
@typedoc """
BlockComment
"""
(Type documentation)
Forth \ LineComment ( BlockComment ) (single line and multiline)

( before -- after ) stack comment convention

FORTRAN I to FORTRAN 77 C LineComment (C in column 1)
Fortran 90 and later ! LineComment #if 0
  BlockComment
#endif
[d]
Haskell -- LineComment {- BlockComment -}
J NB. LineComment
Java // LineComment /* BlockComment */

/** BlockComment */ (Javadoc documentation comment)

Julia # LineComment #= BlockComment =#
Lisp, Scheme ; LineComment #| BlockComment |#
Lua -- LineComment --[==[ BlockComment]==] (variable number of = signs, nestable with delimiters with different numbers of = signs)
Maple # LineComment (* BlockComment *)
Mathematica (* BlockComment *)
MATLAB % LineComment %{
BlockComment (nestable)
%}

Note: Both percent–bracket symbols must be the only non-whitespace characters on their respective lines.
Nim # LineComment #[ BlockComment ]#
Object Pascal // LineComment (* BlockComment *)
{ BlockComment }
OCaml (* BlockComment (* nestable *) *)
Pascal, Modula-2, Modula-3, Oberon, ML: (* BlockComment *)
Perl, Ruby # LineComment =begin
BlockComment
=cut
(=end in Ruby) (POD documentation comment)

__END__
Comments after end of code

PGN, Red ; LineComment { BlockComment }
PHP # LineComment
// LineComment
/* BlockComment */
/** Documentation BlockComment */ (PHP Doc comments)
PILOT R:LineComment
PLZ/SYS ! BlockComment !
PL/SQL, TSQL -- LineComment /* BlockComment */
Prolog % LineComment /* BlockComment */
Python # LineComment ''' BlockComment '''
""" BlockComment """

(Documentation string when first line of module, class, method, or function)

R # LineComment
Raku # LineComment #`{
BlockComment
}

=comment
    This comment paragraph goes until the next POD directive
    or the first blank line.
[23][24]

Rust // LineComment

/// LineComment ("Outer" rustdoc comment)
//! LineComment ("Inner" rustdoc comment)

/* BlockComment */ (nestable)

/** BlockComment */ ("Outer" rustdoc comment)
/*! BlockComment */ ("Inner" rustdoc comment)

SAS * BlockComment;
/* BlockComment */
Seed7 # LineComment (* BlockComment *)
Simula comment BlockComment;
! BlockComment;
Smalltalk "BlockComment"
Smarty {* BlockComment *}
Standard ML (* BlockComment *)
TeX, LaTeX, PostScript, Erlang, S-Lang % LineComment
Texinfo @c LineComment

@comment LineComment

TUTOR * LineComment
command $$ LineComment
Visual Basic ' LineComment
Rem LineComment
' BlockComment _
BlockComment

Rem BlockComment _
BlockComment
[c]
Visual Basic .NET ' LineComment

''' LineComment (XML documentation comment)
Rem LineComment

#If COMMENT Then
  BlockComment
#End If
Visual Prolog % LineComment /* BlockComment */
Wolfram Language (* BlockComment *)
Xojo ' LineComment
// LineComment
rem LineComment
Zig // LineComment
/// LineComment
//! LineComment

References

  1. ^ Three different kinds of clauses, each separates phrases and the units differently:
      1. serial-clause using go-on-token (viz. semicolon): begin a; b; c end – units are executed in order.
      2. collateral-clause using and-also-token (viz. ","): begin a, b, c end – order of execution is to be optimised by the compiler.
      3. parallel-clause using and-also-token (viz. ","): par begin a, b, c end – units must be run in parallel threads.
  2. ^ From the R Language Definition, section 3.2 Control structures: "A semicolon always indicates the end of a statement while a new line may indicate the end of a statement. If the current statement is not syntactically complete new lines are simply ignored by the evaluator."
  3. ^ Bash Reference Manual, 3.1.2.1 Escape Character
  4. ^ Python Documentation, 2. Lexical analysis: 2.1.5. Explicit line joining
  5. ^ "Mathworks.com". Archived from the original on 7 February 2010.
  6. ^ "Parenthesis/Brackets - Windows CMD - SS64.com". ss64.com.
  7. ^ "Scripts - Definition & Usage | AutoHotkey".
  8. ^ For an M-file (MATLAB source) to be accessible by name, its parent directory must be in the search path (or current directory).
  9. ^ a b c "Verbose Syntax - F# | Microsoft Learn". Microsoft Learn. 5 November 2021. Retrieved 17 November 2022.
  10. ^ "Nim Manual". nim-lang.org.
  11. ^ a b "Mathworks.com". Archived from the original on 21 November 2013. Retrieved 25 June 2013.
  12. ^ "Algol68_revised_report-AB.pdf on PDF pp. 61–62, original document pp. 121–122" (PDF). Retrieved 27 May 2014.
  13. ^ "HTML Version of the Algol68 Revised Report AB". Archived from the original on 17 March 2013. Retrieved 27 May 2014.
  14. ^ a b "DLang.org, Lexical". Retrieved 27 May 2014.
  15. ^ "AutoItScript.com Keyword Reference, #comments-start". Retrieved 27 May 2014.
  16. ^ "slang-2.2.4/src/slprepr.c – line 43 to 113". Retrieved 28 May 2014.
  17. ^ "Punctuation · The Julia Language".
  18. ^ "Nim Manual". nim-lang.org.
  19. ^ "Python tip: You can use multi-line strings as multi-line comments", 11 September 2011, Guido van Rossum
  20. ^ "Writing Documentation — Elixir v1.12.3". Retrieved 28 July 2023.
  21. ^ "Perl 6 Documentation (Syntax)". docs.perl6.org. Comments. Retrieved 5 April 2017.
  22. ^ "Using the FPP Preprocessor". Archived from the original on 18 November 2022. Retrieved 18 November 2022.
  23. ^ "Perl 6 POD Comments". 25 May 2023.
  24. ^ "Perl 6 POD (Abbreviated Blocks)". 25 May 2023.

Notes

  a. ^ Visual Basic .NET does not support traditional multi-line comments, but they can be emulated through compiler directives.
  b. ^ a b While C# supports traditional block comments /* ... */, compiler directives can be used to mimic them just as in VB.NET.
  c. ^ a b The line continuation character _ can be used to extend a single-line comment to the next line without needing to type ' or REM again. This can be done up to 24 times in a row.
  d. ^ Fortran does not support traditional block comments, but some compilers support preprocessor directives in the style of C/C++, allowing a programmer to emulate multi-line comments.[22]