Self-Referential Language?

CraigD · April 24, 2009

…he took the breakdown structure (BDS) of an existing language -- a very simple one -- and wrote a small assembly program that read in the BDS as input -- and then it would read in any program written in that higher level language, and reduce it to assembler.

I don't know what they call a BDS nowdays, but it looked something like this:
STATEMENT :== {ARITH_STATEMENT|IF_STATEMENT|LOOP_STATEMENT|begin|stop}
...

Generically, expressions like these are in what are called metalanguage. There are a couple of somewhat standard ones, such as http://Backus-Naur Form (which has expressions like <name> ::= <last-name> “,” <first-name> “ ” <middle-initial>)

and Extended Backus–Naur Form (name = last name , “,” , first name , [ middle initial ], Augmented BNF, etc., but in documents such as ANSI language standards, the metalanguage used is usually explicitly defined in the beginning of the document, and various hard-to-machine-read typesetting features, such as underlining and vertical bracketing, used. RFC2616, which defines the HTTP this page is using, uses ABNF.

Because a language I once translated many megabytes of source code from one implementation of to another had an ANSI standard document using a particular BNF variant, I wrote a parser for it to identify the non-portable parts. It didn’t interpret or compile the language, like Pyro’s friend’s did, just identified every substring within all of the source code by the symbol names given in the language specification, allowing much of the translation to be automatic, and identifying the parts that were structurally complicated to be rewritten by a small team of humans.

My experience and impression is that, since the late 1990s, language implementers have been less concerned with using formal meta-languages, and in many cases adopted an “the interpreter/compiler is the meta-language definition of the language” attitude, rather than making large efforts to write and conform with standards maintained by organizations like ANSI, the WWW consortium being the major exception to this trend.

Buffy · April 24, 2009

Most parser tools use some variation of BNF as input, the most famous--which some of us still use--is YACC ("yet another compiler compiler") which is normally paired with Lex, and both are available in open source versions like GNU's Flex and Bison. YACC is an LALR(1) parser though, which means you normally run into all sorts of nasty shift-reduce conflicts, and the more commonly used parsers today are all Recursive-Descent parsers, which are much more robust. I can recommend the Gold Parser which is free/open source and has oodles of output languages for implementation.

Parsers are indispensable if you're going to be a real programmer.

This of course is not really what KAC is looking for though, which is a language that would subsume an extensible parser as part of it's core functionality....

Mathematicians are like Frenchmen: whatever you say to them they translate into their own language, and forthwith it is something entirely different, :)

Buffy

IDMclean · April 24, 2009

KAC, i'm with you on Lisp, though i think i would do Haskell over Smalltalk; Haskell has had it's day view in human behavioral simulation recently that came out in a game, that came out to be so cool it was and is still being studied by researchers at various universities. As i think behavioral simulation is something you are interested in (from what i think i know about you) that may be a language worth learning and putting higher up then Smalltalk, though i think you should battle with Lisp first :)

Reminds me of something that happened long ago at Texas Instruments.

[...]

Our lead programmer came up with something novel -- he took the breakdown structure (BDS) of an existing language -- a very simple one -- and wrote a small assembly program that read in the BDS as input -- and then it would read in any program written in that higher level language, and reduce it to assembler.

[...]

We got to playing with this, writing little routines in our new hi-level language (HLL) until somebody realized that the BDS text was in a buffer in the computer while executing the HLL. It was trivial then to make a small adjustment in the BDS.

[...]

This meant that in mid-program, we could change the rules of the language and use the new rules immediately.
Eventually, we got it so we could add or delete options from any given line of BSD without having to redefine the whole line. We did this by defining a few new lines of BSD which gave us operators for selecting a specific option within an existing BSD line, and for searching the BSD line by line, option by option.

We had a LOT of fun doing this, but it didn't come to much. We didn't have the time to spend all day experimenting with it, and producing something that wouldn't crash the whole computer was a really big challenge.

But the possibilities...

Generically, expressions like these are in what are called metalanguage. There are a couple of somewhat standard ones, such as http://Backus-Naur Form (which has expressions like <name> ::= <last-name> “,” <first-name> “ ” <middle-initial>)
and Extended Backus–Naur Form (name = last name , “,” , first name , [ middle initial ], Augmented BNF, etc., but in documents such as ANSI language standards, the metalanguage used is usually explicitly defined in the beginning of the document, and various hard-to-machine-read typesetting features, such as underlining and vertical bracketing, used. RFC2616, which defines the HTTP this page is using, uses ABNF.

[...]

My experience and impression is that, since the late 1990s, language implementers have been less concerned with using formal meta-languages, and in many cases adopted an “the interpreter/compiler is the meta-language definition of the language” attitude, rather than making large efforts to write and conform with standards maintained by organizations like ANSI, the WWW consortium being the major exception to this trend.

[...]
I can recommend the Gold Parser which is free/open source and has oodles of output languages for implementation.

Parsers are indispensable if you're going to be a real programmer.

This of course is not really what KAC is looking for though, which is a language that would subsume an extensible parser as part of it's core functionality....
[...]

O, ye wise ones!

I harken unto thee, for perchance ye all may well advise me in the art, science, and crafts of languages of computable truths.

Where would one be well advised to begin in the development of a language such as this? I would think that if I am to use XML as the declarative means of the language, I would be well advised to describe the language according to the strictures of outline suggested in the examination of the problem posed. I would expect--naively perhaps--developing a compiler for the language would be next in the order of things, but I also expect--from experience--that there are other ways to go about developing this language.

Also, Does anyone have links to resources related to writing languages? Tutorials, examples, compiler/interpreter/profiler examinations, and the likes? Do I need to learn assembly and/or C to write my own language?

While I wait, I am going to sketch my concept in XML.

alexander · April 24, 2009

you dont need to know assembly or c to write your language, you can write it in any language you want, generally if it's a compiled language then the compiler is written to smartly translate the language into assembly, and then it is assembled and compiled again into a binary for a particular operating system...

Also most interpreters, for the sheer need of speed of execution, are written in C (more oftenly C++), but as Buffy could tell you, you can write a C compiler in Lisp, nothing's stopping you. Mind you most C compilers are written in C ( yes, go figure that one :) ) also most (well i only know of one, and a shell) C interpreters are written in C.

Point being, no, you dont need to know a low level language to make your own programming language. Infact a well-defined language would tend to attract programmers that will code it in a low-level language for you. There are a few books available on language design

David A. Watt - "Programming Language Design Concepts"

Raphael Finkel - "Advanced Programming Language Design"

Terrence W. Pratt and Marvin V. Zelkowitz - "Programming Languages: Design and Implementation"

Bruce J. MacLennan - "Principles of Programming Languages"

At least a couple of those above should be augmented with "Programming Language Pragmatics Second Edition" by Michael L. Scott

Also these three books in succession (they get progressively more hardcore, and they all talk about modern compiler techniques, loop optimization, SSA, all kinds of good stuff):

Kenneth C. Louden - "Compiler Construction: Principles and Practice" - (this is more of a course book for students)

Andrew W. Appel - "Modern Compiler Implementation in C" - (this is a book that those students read to ace the course and familiarize themselves with up-to-date practices)

Steven Muchnick - "Advanced Compiler Design and Implementation" - (this is the book for the students that can teach the course and have the professor learn things he's probably never heard before)

But also remember, if you go with my advise and get 5 books (one out of the first set, the pragmatics book and the last 3 on compiler design) its nearly a 500 dollar bill, books on advanced topics are not cheap... But my point is, programming language design is above the range of a tutorial, a good programming language takes knowledge and practice, psychology and imagination, a good knowledge of current languages, but not a drive to compete with any of them. Only then can you desing a language that stands out on its own...

alexander · April 24, 2009

Also remember that most of the greatest programming languages were written for the designer and his friends, most bad languages were also written for that purpose, the best language IMHO, Python, was written with a specific mission in mind, but there are plenty of really bad languages out there that were written for that purpose too, so design it to use for yourself and your friends with a mission, hopefully you'll do better then the thousands of horrible, failed languages out there :)

Pyrotex · April 24, 2009

Yes, that was it: BNF. Backus Naur Format. Thanks Craig.

KAC: The logical start IMHO would be defining the starting point BNF in XML, then writing an Interpreter in Python that used the BNF explicitly as the definition of the language itself. Now, for the starter language, I would begin with a simple subset of Python!!! I would only add one teensy feature to this subset of Python, something like a keyword, say "BNF(n,m)", where n is the nth statement in your BNF, and m is the mth option in that statement. (The options are the text between the vertical bars |.) The type of BNF(n,m) would be a string, of course.

TheFaithfulStone · April 24, 2009

Also, Does anyone have links to resources related to writing languages? Tutorials, examples, compiler/interpreter/profiler examinations, and the likes? Do I need to learn assembly and/or C to write my own language?

Not at all. I've written several DSLs in Python using ANTLR as a parser generator.

It's not really what I'd call fun but it's not too bad.

I also wrote one in Perl and generated the tokenizer and parser by hand - and THAT was my own special version of hell.

tfs

alexander · April 24, 2009

perl is the perfect langauge for writing a parser actually, cuz the one thing it does really well is work with text :)

IDMclean · April 25, 2009

It seems I maybe moving onto learning a new notation set, vW-Grammar.

Alright, I've determined that I we need to discuss Abstract Syntax (AST), and Parsing Trees. Also, I feel we might need to discuss abstract semantic graphs. I'm aware that these structures can be utilized to facilitate reflection within a language, but I'm unaware of how one would do that.

Does Java use it's AST to provided reflective capabilities?

IDMclean · May 19, 2009

Alright, so I've been spending a lot of time lurking around the Semantic Web Interest Group Channel on the Freenode IRC network. We've been talking about defining an ontology for describing programming languages in a data-language format. one of the people there, Toby Inkster, has gone and started an implementation which is available.

core terms

one particular view of how programming languages relate together

one particular view of how data languaes fit together.

This schema is purposed for use as the declarative portion of a Language of Languages.

The basic idea which we arrived on is that language is practically impossible to define manually, but we have various implemented languages that we can parse programmatically to build towards an implicit definition of language.

We arrived at this conclusion based on the fact that a full vW-grammar is Turing complete and therefore undecidable as a metalanguage. It will result in an infinite number of languages. If a universal language exists, it would have to express the set of all expressible languages, so any language known of would likely be a subset of a Language of Languages.

enjoy the links,

Ian Mclean

IDMclean · March 28, 2010

Been forever and a day since I last posted here.

I've been reading research and books about logic, metalogic, mathematics, metamathematics, automata theory, computating theory, quantum computation, and more.

For anyone interested, here's my research list:

Computing Theory:

The Mathematical Universe by Max Tegmark

On Generalized Quantum Turing Machine and its Language Classes by Satoshi Iriyama and Masanori Ohya

Quantum Theory, the Church-Turing Principle, and the Universal Quantum Computer by David Deutsch

Quantum Automata, Machines, and Complexity by Anuj Dawar

A New Computational Model: the Quantum Chaotic Turing Machine by Matthew Mains

Discrete, Finite, and Holographic Spacetime:

The Holographic Principle by Raphael Bousso

Spacetime at the Planck Scale: The Quantum Computer View by Paola Zizzi

Is Hilbert Space Discrete? by Roman V. Buniy, Stephen D. H. Hsu, and A. Zee

Leptons and Quarks in a Discrete Spacetime by Franklin Potter

Unification of Interactions in Discrete Spacetime by Franklin Potter

Foundations of Multidimensional Wavelet Theory: The Quarternion Fourier Transform and its Generalizations by Eckhard Hitzer

A New Family Symmetry: Discrete Quaternion Group by M. Frigerio

Of greatest interest and relevance to this thread:

Turning the Liar paradox into a metatheorem of Basic logic

Basic Logic and Quantum Entanglement

Paola Zizzi's Basic Logic is a context-free logic, so would that mean it could be used to construct a 1-level W-Grammar? Also, how do I test to see if a given logic, math, grammar, or language is regular? How would I reduce Paola Zizzi's Basic Logic to a Regular Logic?

alexander · March 29, 2010

Welcome back, I've just been working a lot, but its fun stuff sometimes, i do a ton of coding now, i mean an average day will have me looking at bash, php, perl, java and ruby code, sometimes laced with some C, JavaScript and stuff like that.

Well time to go back and continue reverse engineering this interface. Need to write some code that forks some database-driven threaded code that communicates to interfaces across the net... one thing comes to mind... BUFFFYYY (shh, seeing if she can materialize here like Marry Poppins with some magic for me)... waaaiting... waaaiting... oh never mind, i guess i'll have to write something crazy again... crazier then this: self.gsub(/[^[:alnum:]]/,'').slice(0..2).gsub(/[a-c,A-C]/, "2").gsub(/[d-f,D-F]/, "3").gsub(/[g-i,G-I]/, "4").gsub(/[j-l,J-L]/, "5").gsub(/[m-o,M-O]/, "6").gsub(/[p-s,P-S]/, "7").gsub(/[t-v,T-V]/, "8").gsub(/[w-z,W-Z]/, "9")

Pyrotex · March 31, 2010

This just occurred to me: the father of all self-referential languages. integer arithmetic.

Gödel is best known for his two incompleteness theorems, published in 1931 when he was 25 years of age, one year after finishing his doctorate at the University of Vienna. The more famous incompleteness theorem states that for any self-consistent recursive axiomatic system powerful enough to describe the arithmetic of the natural numbers (Peano arithmetic), there are true propositions about the naturals that cannot be proved from the axioms. To prove this theorem, Gödel developed a technique now known as Gödel numbering, which codes formal expressions as natural numbers.

And this question occurred to me: What do you want to do with this self-referential language? Do you want to use it to modify itself? Or just to answer questions about itself?

IDMclean · April 1, 2010

Thanks Pyrotex, it should have occured to me that Godel's numbering was effectively what we've been discussing.

To answer your question, this would be a language that writes itself. Though, I do want to build a metalanguage for writing domain-specific languages.

Of note for this, there's been work done on self-verifying theories by Dan Willard. With that in mind, a language could be developed for answering questions about language.

Sign In

Self-Referential Language?

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation