User:Petelomax/sandbox

From Wikipedia, the free encyclopedia


Phix
Paradigm: Imperative, procedural, object-oriented
Developer: Pete Lomax
First appeared: 2006
Stable release: 0.8.2 / November 24, 2020
Typing discipline: static, dynamic, strong, duck
Implementation language: self-hosted
OS: Windows, Linux, Browser
License: Open Software License 3.0 / Academic Free License 3.0
Filename extensions: .e, .ex, .exw, .edb
Website: phix.x10.mx
Influenced by: Euphoria

Phix is an open source, self-hosted, interpreted or compiled programming language with a strong emphasis on simplicity and plain human-readable error messages.

History

Robert Craig released the first version of Euphoria in July 1993.
In 2006, Pete Lomax created Phix as a clone of Euphoria, initially relying on a closed-source backend written in FASM.
By 2013 the backend had been transferred to (open-source) inline assembly, which removed that dependency completely.

Significant releases in the past four years

Version  Date      Key features
0.7.6    Jul 2017  try/catch, regex, json
0.7.7    Feb 2018  ipc, serialise
0.7.8    Mar 2018  zip
0.7.9    Apr 2018  (maintenance release)
0.8.0    Apr 2019  gmp, xml, sqlite
0.8.1    Mar 2020  structs, classes
0.8.2    Nov 2020  unit testing

Versions 0.8.0/1/2 have been retrospectively nominated 1.0.0/1/2, though the binaries and documentation retain the above numbers.
The next release will be 1.0.3, expected mid-2021.

Overview

The design of Phix is based on four tenets: simplicity, clarity, consistency, and making debugging easier, rather than clinging to the vain hope of avoiding it altogether.

  • The five core data types can be remembered on the fingers (and thumb) of one hand, see below.
  • Phix implements automatic garbage collection, even for manually allocated raw memory, albeit the latter is optional and defaults to false.
  • Programs can be interpreted or compiled. Interpretation is fast as it builds the same machine code as compilation, just executes it directly in memory.
  • Compilation of a program you've been interpreting is as simple as adding a -c switch, eg 'p test' --> 'p -c test': command lines are kept dirt simple.
  • Compiler and run-time errors are made as human-readable as possible, and always include the offending source file name and line number.
  • Incorporates both a source-level debugger with single-stepping and the ability to enable/disable on selected blocks/files, and an execution profiler.
  • Strings are fully mutable with variable length slice substitution. Sequences can grow and shrink at will with no manual housekeeping.
  • Extensive run-time checking occurs for out-of-bounds subscripts, uninitialized variables, inappropriate types (eg trying to store a string in an integer), etc.
  • The switch statement requires an explicit fallthrough statement when that is desired, rather than punishing an accidentally omitted break statement.
  • Does not natively have the kind of "maths" whereby 255 plus 1 is 0, 0-1 is +4GB, or 127+1 is -128, although such is achievable via and_bits() [gmp rqd for > 32 bits].
  • Consistent operators, eg the Phix & operator always concatenates, and the + operator always adds, unlike say JavaScript[1].
  • Explicitly tagged ends, such as "if ... then ... end if", while more verbose than {}, catch more errors and avoid problems such as the dangling else.
  • The Windows installation includes a complete graphical user interface (pGUI[2]/IUP[3]) out of the box, though some additional setup is required on Linux.
  • Work is in progress[4],[5] on a new wrapper for GTK[6] (est completion Q1 2022), and several older (usually 32-bit only) wrappers exist[7].
  • Phix has a simple built-in database[8] and SQLite[9],[10], and wrappers for a variety of other databases[11].
  • Components such as ipc[12], json[13], curl[14], zip[15], gmp[16], regular expressions[17], sockets[18], and unit testing[19] are also bundled in with Phix (currently 30MB).
  • An archive of some 2000 other user contributions[20] is available online, as well as about 500 bundled demos and over 1250 entries on rosettacode[21].
  • The compiler generates executable binary files directly without the need to install any other compiler or linker.
  • To re-compile Phix from source simply run `p -cp`. No other compilers or similar tools need to be installed.
  • Available for Windows and Linux, in both 32 and 64 bit versions.
  • Work on a Transpiler to JavaScript is in progress (est completion Q3 2021), so that programs can be run in a browser.
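
Two of the points above, the explicit fallthrough and the opt-in wraparound arithmetic, can be sketched as follows. This is an illustrative fragment only: the variable names are made up for the example, and the comments describe the behaviour expected from the rules stated above rather than a recorded run.

```phix
-- switch: a case ends at the next case unless fallthrough is stated
integer i = 1
switch i do
    case 1:
        puts(1,"one\n")
        fallthrough             -- deliberately continue into case 2
    case 2:
        puts(1,"one or two\n")
    case 3:
        puts(1,"three\n")       -- not reached when i is 1
end switch

-- byte-style wraparound has to be requested explicitly
integer b = 255
b = and_bits(b+1,#FF)           -- 256 masked to 8 bits
?b                              -- 0
```

Without the and_bits() call, b+1 would simply be 256.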

Five core types

Phix has just five builtin data types:

       <-------- object --------->
       |                |
       +-number         +-sequence
         |                |
         +-integer        +-string

• An object can hold any value, specifically either a number or a sequence.
• A number (aka atom) can hold a single floating point numeric value, or an integer.
• An integer can hold a single whole number (at least +/- 1,000,000,000).
• A sequence can hold a collection of values, of any length, nested to any depth, or a string.
• A string can hold a series of characters, or raw binary data, and is fully mutable.

The underlying memory allocation and deallocation is handled automatically via reference counting.

Note that in Phix the term 'object' has no special relationship to object-oriented programming.

Further user-defined types can also be defined, for instance dictionaries and class instances, however they are actually just specialised versions of the above.

The number type has recently been added as an alias of the traditional but less intuitive atom, and will not be available until 1.0.3 is shipped.
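
A minimal sketch of the five core types in use is given below. It uses atom rather than number, since the text above states that the number alias is not available before 1.0.3; the variable names are purely illustrative.

```phix
integer  i = 42                          -- a whole number
atom     a = 3.5                         -- a floating point value (or an integer)
string   s = "hello"                     -- characters or raw binary data
sequence q = {1, 2.5, "three", {4,{5}}}  -- any length, nested to any depth
object   o = i                           -- an object can hold any value
o = q                                    -- including a sequence

?atom(a)        -- 1 (true)
?integer(a)     -- 0 (false): 3.5 is not a whole number
?sequence(s)    -- 1: every string is also a sequence
?string(q)      -- 0: not every sequence is a string
```

Each type name doubles as a test routine, which is what makes the last four lines possible.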

1-based indexes

The widespread use of 0-based indexes[22] is often justified by a paper[23] by Edsger W. Dijkstra and by the benefits for pointer arithmetic[24], which Phix does not have.
There have been many other debates[25][26][27][28] about the use of 0-based vs. 1-based indexing.
In 0-based indexing the first,last,length are 0,n-1,n whereas in 1-based indexing they are 1,n,n.
If the first element of an array is a[0], you cannot index the last as a[-0], whereas a[1] and a[-1] are much more reasonable.
In a language with both 0-based indices and negative indices, apart from the above discontinuity, there is no sensible value that eg find()
can return that is not also a valid index, whereas in Phix, find() can return 0 and that won't accidentally retrieve the wrong element.
In Phix, s[5..5] has a length of 1, and s[5..4] a length of 0, whereas in Python s[5:5] has a length of 0, it is s[5:6] that has a length of 1.
You don't need to explain 1-based indexing to anyone and you certainly don't need a diagram.
For these reasons, Phix uses 1-based indexing, similar to Julia[29] and Lua[28].
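
The indexing rules described above can be illustrated with a short sketch; the sequence s and its contents are invented for the example.

```phix
sequence s = {10,20,30,40,50}

?s[1]               -- 10, the first element
?s[-1]              -- 50, the last element
?find(30,s)         -- 3
?find(99,s)         -- 0, "not found", which is never a valid index
?length(s[5..5])    -- 1
?length(s[5..4])    -- 0, an empty slice
```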

In "What’s the big deal? 0 vs 1 based indexing"[30] Steven Sagaert provides a simple explanation (for Julia):

That’s exactly it. 
1-based indexing is actual indexing like in mathematics, 
while 0-based “indexing” isn’t indexing at all but pointer arithmetic. 
This comes from C where an array is just syntactic sugar for a pointer. 
One-based indexing allows extensive and intuitive operations on sequences.

Parameter passing

Core types

Integers are always passed by value, whereas number/sequence/string are passed by reference with copy-on-write semantics. That realises the performance benefits of pass-by-reference but with the behaviour of pass-by-value, for example:

function set55(sequence s)
   s[5][5] = 5 
   return s 
end function

sequence a = repeat(repeat(0,5),5), -- a 5x5 matrix
         b = set55(a)

The value of a is not changed by the call, while b ends up with a single 5 in the lower right. In fact only the top level of b and b[5] are "cloned"; b[1..4] still share the same memory as a, and will cease to do so only if and when further modifications are made to b. There is no need to re-clone the top level of b or b[5] again (and now ditto a); those non-shared parts can instead be updated in situ. The fact that non-integer numbers quietly undergo a similar treatment should be of little or no practical concern, except perhaps to note that an additional indirection and/or memory allocation is often associated with floating point operations.

Automatic PBR optimisation

When a local variable is both passed as an argument to a routine and explicitly reassigned on return, the compiler applies a special optimisation by which the local variable becomes undefined over the call, leaving the parameter with a refcount of 1 and therefore amenable to in-situ modification. Even when this does not occur, all parameters can be modified locally (i.e., within the callee) which is implemented very efficiently as sequences (and sub-sequences) have automatic copy-on-write semantics. In other words, when a sequence is passed to a routine, initially only a reference to it is passed, but at the point it is modified the very minimum necessary internal cloning is automatically performed and the routine only updates the now-uniquely-owned parts, leaving the original intact, albeit with some reduced hidden reference counts. In the above, a and b end up looking quite different but in fact still share 80% of their (lowest-level) contents.
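
As a sketch of the pattern that the optimisation targets, the following repeats the set55() routine from the earlier example but assigns the result back to the variable that was passed in. Whether the update really happens in situ is an internal detail that cannot be observed from the program itself; only the visible result is shown.

```phix
function set55(sequence s)
    s[5][5] = 5     -- can be done in situ if s is uniquely owned
    return s
end function

sequence a = repeat(repeat(0,5),5)  -- a 5x5 matrix
a = set55(a)        -- a is both the argument and the target of the result
?a[5]               -- {0,0,0,0,5}
```

Had the result been assigned to a different variable, a would have to remain intact and copy-on-write cloning would apply instead.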

Reference types

In contrast, dictionary and struct/class instance parameters are more accurately described as pass-by-reference. Again the reality is slightly different, similar to Java[31], for example:

class mytest
  public string s
end class

procedure change(mytest c, d)
  c = new({"c"})
  d.s = "d"
end procedure

mytest a = new({"a"}),
       b = new({"b"})
change(a,b)
?a.s -- "a"
?b.s -- "d"

In practice, a remains unmodified, because we simply overwrote the reference to it, whereas b is modified indirectly as and when the field of d is updated. Note that had you instead performed delete(c), then a would have instantly become meaningless and invalid, and further that a cannot be nullified within change() other than by making it a function which returns (c or null), and then explicitly assigning that return back to a (and you might then want to make class mytest nullable).

One other (fairly obvious) true thing to say is that new() creates space to hold values, and you can never hold more values, that is, in instance variables, than the number of times you have invoked new(). A very similar program but with no classes and all strings could at one point hold four different strings in a,b,c,d, but that is simply not possible in the above because new() has only been called three times.

Examples

Line comments start with a double hyphen -- or a C-style // and continue to the end of line.
Block comments start with /* or --/* and end with */ or --*/, and can be nested. Euphoria treats the -- style as a line comment, which is why two types exist.

Hello world, console

puts(1, "Hello, World!\n")

Hello world, GUI

include pGUI.e
IupOpen()
IupShow(IupDialog(IupVbox({IupLabel("World!")},"MARGIN=90x20"),"TITLE=Hello"))
if platform()!=JS then
  IupMainLoop()
  IupClose()
end if

Note the browser screenshot is from a close-ish proof-of-concept, rather than actual output from the currently in progress transpiler (est Q3 2021).

Simple function

include pGUI.e  -- for CD_DEG2RAD

function deg_to_rad(number deg)
    return deg*CD_DEG2RAD
end function
?deg_to_rad(180)        -- outputs 3.141592654
{} = deg_to_rad(360)    -- explicit discard required

User defined types

type index(object x)
    return integer(x) and x>0
end type
index ndx = 5
ndx = -2    -- run-time error "type check failure, ndx is -2", plus source code file name and line number

Note that user defined types are used primarily for validation and debugging purposes, rather than being fundamentally different to the five core builtin types.

String mutation

Strings are fully mutable, with variable length slice substitution:

string s = "food"  ?s   -- outputs "food" 
s[2..3] = "e"      ?s   -- outputs "fed" 
s[2..1] = "east"   ?s   -- outputs "feasted"

Exception handling

try 
   integer i = 1/0 
   -- or, for example, throw("file in use") 
catch e 
   ?e[E_USER] 
end try 
puts(1,"still running...\n") 

Output:

"attempt to divide by 0" -- or "file in use" 
still running...

Filtering

function odd(integer a) return remainder(a,2)=1 end function  
function even(integer a) return remainder(a,2)=0 end function  
   
?filter(tagset(10),odd)     -- ==> {1,3,5,7,9}  
?filter(tagset(10),even)    -- ==> {2,4,6,8,10}  

Version control

Specify what versions and/or operating systems are required to run the source code:

requires("0.8.2")           -- crashes on 0.8.1 and earlier 
requires(WINDOWS)           -- crashes on Linux 
requires(64)                -- crashes on 32-bit

In the latter case, if you try to run a 64-bit-only program with a 32-bit interpreter, it will try to locate a suitable 64-bit interpreter and offer to re-run it with that (and vice-versa, for instance the older arwen and win32lib libraries are 32-bit only).

Unit testing

test_equal(2+2,4,"2+2 is not 4 !!!!") 
test_summary() 

If all goes well, no output is shown, and the program carries on normally.
You can easily force a summary to be output, quietly create a logfile, etc.[19]

Debugging

Phix takes the pragmatic view that debugging is inevitable and essential and should be as user-friendly as possible.
In short, Phix strives to provide an intuitive version of the features you never knew gdb had.[32]

Error reporting

When a Phix program crashes, it produces a human-readable file, ex.err, which contains the full call stack and the value of
every variable at that point. These can be quite large, but the most pertinent information is typically at the start of the file.
Error messages are made as clear as possible, for example

 C:\Program Files (x86)\Phix\demo\ilest.exw:43 in function strip_comments()
 type check failure, line is {"--","-- builtins/assert.e (an autoinclude)..

At that particular point line was supposed to be a string, not a list of strings.
Where possible, of course, the compiler tries to pre-empt that kind of run-time error with a compile-time error, eg

 C:\Program Files (x86)\Phix\demo\ilest.exw:43 in function strip_comments()
 line = 5
      ^ type error (storing atom in string)

Source level tracing

Place "with trace" before any section(s) of code you want to step through, and "without trace" before any you want to skip (which
can be an entire block of include statements, typically there is not much point single-stepping through any of the tried-and-tested
standard includes that are pre-installed with Phix) and something like "if i=1234 then trace(1) end if" at some appropriate point.

The program will then run until the condition (i=1234) is met, before single-stepping through the subsequent code.

Source level tracing in Phix

Type based debugging

Suppose some table t has the contents {12.35, 15.87, 17.17, ..} at some point of failure, but you were expecting t[3] to be 17.57.
It would normally be very helpful to know where exactly the wrong contents were placed in that table. Edit and re-run with say:

--sequence t = {}
type badtable(sequence t)
   if length(t)>=3 and t[3]<17.2 then
      ?9/0
   end if
   return true
end type
badtable t = {}

That will crash at the required point, producing an ex.err, alternatively you could trace(1) to start source-level tracing instead.
Note that "without typecheck" directives in the source code can, fairly obviously, completely disable this technique.

Feature summary

Paradigms: imperative, procedural, optionally object-oriented

Standardized: No, the manual includes the language specification

Type strength: strong

Type safety: safe

Expression of types: explicit, partially implicit

Type compatibility: duck

Type checking: dynamic, static

Parameter Passing Methods Available: copy on write, immutable reference, multiple returns

Garbage collection: Reference counting

Intended use: Application, Educational, General, High-level scripting, Text processing

Design goals: Simplicity, Readability, Ease of use

Unsupported features

Phix does not (although most can be emulated) directly support operator/builtin/function overloading, lambda expressions, closures, currying, eval, partial function application, function composition, function prototyping, monads, generators, anonymous recursion, the Y combinator, aspect oriented programming, interfaces, delegates, first class environments, implicit type conversion (of the destructive kind), interactive programming, inverted syntax, list comprehensions, metaprogramming, pointers (other than to raw allocated memory), topic variables, enforced singletons, safe mode, s-expressions, or formal proof construction. The author wryly comments "That should both scare off and attract the right people".

No operator overloading means that '+' always does what you think it should.
No builtin overloading means that min() always does what the manual says, not (sometimes) extract the 36 out of "2:36:15pm".
No function overloading means that there is at most one (polymorphic) version of a routine in a given scope, not ten from which the compiler picks using convoluted argument type matching rules.
Other entries in that list, in particular eval and interactive programming, may be nice-to-haves and are not necessarily ruled out forever.

Despite, or perhaps because of, that, Phix has some 1,277[21] completed rosettacode[33] tasks, second[34] only to Go.

Notable failures and criticism

A common criticism is that Phix would be more portable if it were written in C. The author points to Euphoria, and the problems compiling that, as a rebuttal.

The inline assembly on which Phix is based is x86/64 only, making (for instance) an ARM port extremely difficult (there are an estimated 24,000 lines of such code).

For many years Phix would consistently segfault (preventing all use) on some versions of Linux but not others.
This is now believed to have been caused by kernels that require 16-byte (as opposed to 8-byte) stack alignment.
A source-level patch is available[35] for 0.8.2 (aka 1.0.2), but 1.0.3 will be the first binary release with that fix pre-applied, and several others[36],[37].

While Phix claims to be an all-in-one installation, and comes with IUP binaries pre-installed on Windows, it is necessary to install IUP and/or GTK on Linux manually before a GUI can be used.
The instructions for installing IUP on Linux in particular are somewhat hidden (demo/pGUI/lnx/installation.txt) and of generally poor quality.

System-wide installation such that Phix can be run from any directory is poorly documented on all systems.

The Edita editor (Windows 32-bit only, bundled with Phix) hangs on startup if there are any non-responsive windows. Another attempt to fix this is promised for 1.0.3.

Installation on Windows defaults to C:\Program Files (x86)\Phix\, however unless that directory is granted full access permissions (and Phix reinstalled), the VirtualStore (aka File and Registry Virtualization) feature of Windows is known to cause problems.
Those problems would (probably) be solved if the installation process made proper use of %ALLUSERSPROFILE% and %LOCALAPPDATA% as it should.

The natural syntax of say person.age is actually mapped by the compiler front-end to (eg) structs:fetch_field(person, "age").
No similar mapping is performed for dictionaries, which must use eg setd("key", "value", mydict) style syntax, rather than say mydict["key"] := "value".
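
For comparison, a minimal sketch of the dictionary routine style mentioned above, using the bundled new_dict(), setd(), getd() and destroy_dict() routines; the name mydict and its contents are illustrative only.

```phix
integer mydict = new_dict()     -- a dictionary is referred to by an integer id
setd("key", "value", mydict)    -- rather than mydict["key"] := "value"
?getd("key", mydict)            -- "value"
destroy_dict(mydict)            -- release it when no longer needed
```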

The extensive run-time checking, which can increase productivity, incurs an inevitable performance penalty (at times a factor of 8), but while eg "without typecheck" helps, it cannot generally be fully disabled.

Subscript performance in particular, while substantially faster than say Python or JavaScript, is substantially slower than say C++ or Go (mainly for the reason just mentioned).

Floating point operations typically incur additional indirection and/or memory allocation compared to other languages (the impact of which is usually less noticeable than subscripts).

The use of inline assembly to alleviate runtime hotspots is both difficult and poorly documented.

The Phix debugger does not support reverse or replay debugging.

Many programmers weaned on 0-based indexing will find the transition to 1-based indexing rather painful, at least at first.

Comparable languages

Comparison with Euphoria

The following differences were present in the first release:

  • Phix has 8-bit strings and variable length slice substitution
  • Phix requires explicit sequence ops such as sq_add() rather than an implicit inline +
  • Euphoria requires equal() and compare() in many cases where Phix allows ==, <, etc.
  • Euphoria can be transpiled to C, Phix cannot
  • Euphoria allows floating-point for loop control variables, Phix does not
  • Euphoria requires explicit includes for many things Phix can auto-include
  • The machine_func/machine_proc of Euphoria are mostly deprecated in Phix
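
The second and third points above can be sketched on the Phix side as follows (the values are arbitrary): sequence arithmetic goes through explicit routines such as sq_add(), while the ordinary comparison operators work directly on sequences and strings.

```phix
?sq_add({1,2,3},10)             -- {11,12,13}
?sq_add({1,2,3},{10,20,30})     -- {11,22,33}

sequence x = {1,2,3}
if x=={1,2,3} then              -- no equal() needed
    puts(1,"equal\n")
end if
if "abc"<"abd" then             -- no compare() needed
    puts(1,"less\n")
end if
```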

Over the last 14 years the implementation of Phix has slowly diverged from Euphoria as follows:

  • Phix has structs and classes, with reference semantics
  • What Euphoria calls maps, Phix calls dictionaries, and uses a different syntax
  • Phix has multithreading, as well as multitasking, whereas Euphoria only has the latter
  • Phix allows negative indexes from right to left, mirroring positive indexes' left to right
  • Phix has named parameters, and does not permit parameter omission with ,, syntax
  • Phix allows the explicit := and == as well as the (implicit) = determined from context
  • Phix allows nested constant declarations via the := operator
  • Euphoria does not support the ~ << >> && || operators
  • Phix allows ; as an entirely optional statement separator
  • Euphoria allows full expressions in parameter defaults whereas Phix is relatively limited
  • Euphoria returns a length of 1 for an atom, whereas Phix (deliberately) crashes
  • Euphoria requires routine_id("name") whereas Phix also allows a bare name
  • Euphoria requires call_func(rid,{...}) whereas Phix also allows rid(...)
  • Phix allows inline variable declaration in desequencing operations
  • Phix has a format directive to specify compilation options, nothing of that ilk in Euphoria
  • The ternary operator (iff) in Phix has full short-circuit evaluation, not so in Euphoria
  • Phix allows forward routine declarations and un-named parameters
  • Euphoria supports binary/octal/hex/decimal, Phix supports all number bases 2..36
  • Phix allows static variables and routine-level constants (except on fwd calls)
  • Euphoria supports enum by /N, which Phix does not
  • Euphoria's private/export/global are all simply treated as global by Phix
  • Several include and with/without options are different and/or incompatible
  • Euphoria namespaces can reference the current file, which is broken in Phix
  • Euphoria allows forward references to variables, Phix does not
  • Euphoria allows implicit discard of function results, Phix requires "{} ="
  • Euphoria has with label, break label, and a loop construct, Phix does not
  • Phix has a try/catch/throw, whereas Euphoria has no exception handling
  • Phix allows min(i,j) as well as min({i,j,k,..}), ditto max()
  • Euphoria still has match_from() whereas Phix relies on match() optional start
  • Euphoria yields 97.36 for lower(65.36) whereas Phix yields 65.36
  • Euphoria permits daisy-chaining delete_routine(), Phix is single-shot
  • Euphoria must allocate_string for c_func, Phix can pass strings directly
  • Phix has number as an alias for atom, Euphoria does not
  • Phix supports inline assembly (and sometimes relies on it)
  • No new version of Euphoria has been released since Feb 2015
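
A few of the Phix-side entries in the list above, shown as a small sketch. The values and names are arbitrary, and the iff() call assumes the condition ? a : b form of the ternary.

```phix
?min(3,7)                       -- 3, two-argument form
?min({3,7,1})                   -- 1, sequence form

integer n = 5
?iff(n>3 ? "big" : "small")     -- "big"; only the chosen branch is evaluated

{} = length("abc")              -- a function result must be explicitly discarded
```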

Despite all these, a fair amount of legacy code still runs happily on both.

References

  1. ^ "Javascript + operator". Retrieved 2021-01-09.
  2. ^ "pGUI documentation". Retrieved 2020-12-31.
  3. ^ "IUP". Retrieved 2020-12-31.
  4. ^ "Phix and GTK example". Retrieved 2020-12-31.
  5. ^ "Phix/Gtk listview example". Retrieved 2021-01-16.
  6. ^ "GTK". Retrieved 2020-12-31.
  7. ^ "PCAN/interfaces". Retrieved 2020-12-31.
  8. ^ "database documentation". Retrieved 2020-12-31.
  9. ^ "SQLite". Retrieved 2021-01-14.
  10. ^ "pSQLite documentation". Retrieved 2021-01-14.
  11. ^ "other databases". Retrieved 2020-12-31.
  12. ^ "ipc documentation". Retrieved 2021-01-14.
  13. ^ "json documentation". Retrieved 2021-01-14.
  14. ^ "libcurl documentation". Retrieved 2021-01-14.
  15. ^ "LiteZip documentation". Retrieved 2021-01-14.
  16. ^ "gmp/mpfr documentation". Retrieved 2021-01-14.
  17. ^ "regex documentation". Retrieved 2021-01-14.
  18. ^ "sockets documentation". Retrieved 2021-01-14.
  19. ^ a b "unit test documentation". Retrieved 2021-01-14.
  20. ^ "PCAN". Retrieved 2021-01-16.
  21. ^ a b "Phix on Rosetta Code". Retrieved 2021-01-14.
  22. ^ "Array indexing comparison". Retrieved 2021-01-09.
  23. ^ "Why number should start at zero by Edsger W. Dijkstra". Retrieved 2021-01-09.
  24. ^ "Array datatype index origin". Retrieved 2021-01-09.
  25. ^ "Is Index Origin 0 a Hindrance? Roger Hui". Retrieved 2021-01-09.
  26. ^ "Thread on Julia google groups". Retrieved 2021-01-09.
  27. ^ "Again on 0-based vs. 1-based indexing". Retrieved 2021-01-20.
  28. ^ a b "Lua, a misunderstood language". Retrieved 2021-01-20.
  29. ^ "Indexing of Arrays: 0 vs 1". Retrieved 2021-01-17.
  30. ^ "What's the big deal? 0 vs 1 based indexing". Retrieved 2021-01-14.
  31. ^ "Java is Pass-by-Value, Dammit!". Retrieved 2021-01-17.
  32. ^ "Give me 15 minutes & I'll change your view of GDB". Retrieved 2021-01-17.
  33. ^ "Rosetta Code". Retrieved 2021-01-14.
  34. ^ "Rosetta Code/Count examples/Full list". Retrieved 2021-01-14.
  35. ^ "Segfault on Linux". Retrieved 2021-01-14.
  36. ^ "Trouble with double". Retrieved 2021-01-14.
  37. ^ "Phix not terminating on Linux". Retrieved 2021-01-14.

Languages implemented in Phix


Category:Procedural programming languages Category:Cross-platform software Category:Programming languages created in 2006 Category:Free educational software