Squashed 'xonsh/ply/' changes from 393cc558..0f398b72

0f398b72 Minor cleanup of support files
28575bd8 Merge pull request #152 from astrofrog/fix-whitespace
17a726c4 Fixed token order
e8c6af0a Merge branch 'master' of https://github.com/dabeaz/ply
5d544408 Fixed issue #148
807d3816 Merge pull request #153 from astrofrog/fix-reflags-python3
e678d756 Force reflags to be converted to an integer on Python 3
763acec6 Added regression test for bug in Python 3 with reflags and optimize=True (in Python 3, re module flags are RegexFlags instances, not integers)
3e3b45f8 Remove trailing whitespace
6860652b Merge pull request #135 from laerreal/bugfixes
149a11c5 Merge pull request #131 from hugovk/patch-1
1f9775de Merge pull request #141 from segevfiner/fix-tabmodule-class-in-package
e5925a4d Merge pull request #125 from ignamv/pylint_disable
27b327ee Merge pull request #128 from gvalkov/master
7f9b69fe Merge pull request #127 from psihonavt/patch-1
eed7b381 Merge pull request #139 from segevfiner/fix-find-column
ebf2d286 Calculate the correct tabmodule for parsers defined in a class inside a package
3d7c860b The find_column example returns the column off by +1 for every line but the first.
069239bf test: update README
5a74b95c cpp: check token list bounds during macro expansion
e71a4a04 test: add example of IndexError during expansion of a parametrized macro
12f4bd55 cpp: avoid infinite attempts to expand a word same as a parameterized macro
40bec356 test: add example that leads preprocessor to a dead loop
0f874685 cpp: fixup removal of '##' around macro argument during concatenation
9fdca0bf test: add example of incorrect expansion of concatenation (##) in macro
1fd7e122 test: add a framework for C preprocessor testing
7f4a6cc3 Add title formatting and build badge
15d42d9d Merge pull request #130 from Carreau/docs-id
1fd90675 Fix a couple of duplicated ids in the docs.
a986c8d6 Simple makefile for common tasks
2ef24ded Update yacc.py
d110a058 Add pylint command to disable warnings on generated parsetab.py
cbef61c5 Merge pull request #124 from boriel/fix_parsetab_signature_py3
eb7e15ac Fix issue #31 on python3
b791b089 Bump version
43fe6fc7 Merge pull request #119 from divergentdave/clear-modules-cache
2da92886 Merge pull request #116 from alberth/flakes_fixes
db00266e Merge pull request #118 from divergentdave/patch-1
d59e139f Merge pull request #114 from JackDanger/jackdanger/fix-version-bump
5293c7be Clear table modules from cache when overwriting
877b0ec6 Fix typo in comment
a813d9d7 Fixes for problems reported by pyflakes3
2e21642c bump remaining file to 3.10
d4c86648 Merge pull request #112 from brettcannon/patch-2
c1f07cc1 Merge pull request #111 from brettcannon/patch-1
ac7d4a83 Add some missing spaces in a code example
c3e1c9b9 Fix a grammar mistake in the docs
031fb0ee Fixed issue #110
d951379d Updated dates. Release info
2ba2f31e Merge branch 'master' of https://github.com/dabeaz/ply
3335be29 Reworked signature code to not use digests or hashes.
11eb4cf0 Merge pull request #107 from oboroc/master
69de7b84 Minor patch up for re flags
b659dab5 Merge pull request #102 from ignamv/ignamv-noverbose
a35469c0 Fixed issue #103
c265bbb0 Switch print to print() as per PEP 3105
fc7b81d9 Make re.VERBOSE flag optional

git-subtree-dir: xonsh/ply
git-subtree-split: 0f398b72618c1564d71f7dc0558e6722b241875a
Gil Forsyth 2018-11-02 14:46:29 -04:00
parent ab7536aa4d
commit 6bfa551f69
21 changed files with 315 additions and 146 deletions

View file

@ -2,9 +2,10 @@ language: python
python:
- "2.6"
- "2.7"
- "3.2"
- "3.3"
- "3.4"
- "3.5"
- "3.6"
install:
- "pip install . "
script: "cd test && python testlex.py && python testyacc.py"

View file

@ -1,11 +1,11 @@
October 7, 2016
February 15, 2018
Announcing : PLY-3.10 (Python Lex-Yacc)
Announcing : PLY-3.11 (Python Lex-Yacc)
http://www.dabeaz.com/ply
I'm pleased to announce PLY-3.10--a pure Python implementation of the
common parsing tools lex and yacc. PLY-3.10 is a minor bug fix
I'm pleased to announce PLY-3.11--a pure Python implementation of the
common parsing tools lex and yacc. PLY-3.11 is a minor bug fix
release. It supports both Python 2 and Python 3.
If you are new to PLY, here are a few highlights:

CHANGES
View file

@ -1,5 +1,22 @@
Version 3.11
---------------------
02/15/18 beazley
Fixed some minor bugs related to re flags and token order.
Github pull requests #151 and #153.
02/15/18 beazley
Added a set_lexpos() method to grammar symbols. Github issue #148.
04/13/17 beazley
Mostly minor bug fixes and small code cleanups.
Version 3.10
---------------------
01/31/17: beazley
Changed grammar signature computation to not involve hashing
functions. Parts are just combined into a big string.
10/07/16: beazley
Fixed Issue #101: Incorrect shift-reduce conflict resolution with
precedence specifier.

Makefile Normal file
View file

@ -0,0 +1,17 @@
PYTHON ?= python
test:
cd test && $(PYTHON) testlex.py
cd test && $(PYTHON) testyacc.py
wheel:
$(PYTHON) setup.py bdist_wheel
sdist:
$(PYTHON) setup.py sdist
upload: wheel sdist
$(PYTHON) setup.py bdist_wheel upload
$(PYTHON) setup.py sdist upload
.PHONY: test wheel sdist upload

View file

@ -1,6 +1,8 @@
PLY (Python Lex-Yacc) Version 3.10
# PLY (Python Lex-Yacc) Version 3.11
Copyright (C) 2001-2016
[![Build Status](https://travis-ci.org/dabeaz/ply.svg?branch=master)](https://travis-ci.org/dabeaz/ply)
Copyright (C) 2001-2018
David M. Beazley (Dabeaz LLC)
All rights reserved.

View file

@ -12,7 +12,7 @@ dave@dabeaz.com<br>
</b>
<p>
<b>PLY Version: 3.10</b>
<b>PLY Version: 3.11</b>
<p>
<!-- INDEX -->

View file

@ -12,13 +12,13 @@ dave@dabeaz.com<br>
</b>
<p>
<b>PLY Version: 3.10</b>
<b>PLY Version: 3.11</b>
<p>
<!-- INDEX -->
<div class="sectiontoc">
<ul>
<li><a href="#ply_nn1">Preface and Requirements</a>
<li><a href="#ply_nn0">Preface and Requirements</a>
<li><a href="#ply_nn1">Introduction</a>
<li><a href="#ply_nn2">PLY Overview</a>
<li><a href="#ply_nn3">Lex</a>
@ -34,7 +34,7 @@ dave@dabeaz.com<br>
<li><a href="#ply_nn12">Error handling</a>
<li><a href="#ply_nn14">EOF Handling</a>
<li><a href="#ply_nn13">Building and using the lexer</a>
<li><a href="#ply_nn14">The @TOKEN decorator</a>
<li><a href="#ply_nn14b">The @TOKEN decorator</a>
<li><a href="#ply_nn15">Optimized mode</a>
<li><a href="#ply_nn16">Debugging</a>
<li><a href="#ply_nn17">Alternative specification of lexers</a>
@ -42,7 +42,7 @@ dave@dabeaz.com<br>
<li><a href="#ply_nn19">Lexer cloning</a>
<li><a href="#ply_nn20">Internal lexer state</a>
<li><a href="#ply_nn21">Conditional lexing and start conditions</a>
<li><a href="#ply_nn21">Miscellaneous Issues</a>
<li><a href="#ply_nn21b">Miscellaneous Issues</a>
</ul>
<li><a href="#ply_nn22">Parsing basics</a>
<li><a href="#ply_nn23">Yacc</a>
@ -50,10 +50,10 @@ dave@dabeaz.com<br>
<li><a href="#ply_nn24">An example</a>
<li><a href="#ply_nn25">Combining Grammar Rule Functions</a>
<li><a href="#ply_nn26">Character Literals</a>
<li><a href="#ply_nn26">Empty Productions</a>
<li><a href="#ply_nn26b">Empty Productions</a>
<li><a href="#ply_nn28">Changing the starting symbol</a>
<li><a href="#ply_nn27">Dealing With Ambiguous Grammars</a>
<li><a href="#ply_nn28">The parser.out file</a>
<li><a href="#ply_nn28b">The parser.out file</a>
<li><a href="#ply_nn29">Syntax Error Handling</a>
<ul>
<li><a href="#ply_nn30">Recovery and resynchronization with error rules</a>
@ -64,11 +64,11 @@ dave@dabeaz.com<br>
</ul>
<li><a href="#ply_nn33">Line Number and Position Tracking</a>
<li><a href="#ply_nn34">AST Construction</a>
<li><a href="#ply_nn35">Embedded Actions</a>
<li><a href="#ply_nn35b">Embedded Actions</a>
<li><a href="#ply_nn36">Miscellaneous Yacc Notes</a>
</ul>
<li><a href="#ply_nn37">Multiple Parsers and Lexers</a>
<li><a href="#ply_nn38">Using Python's Optimized Mode</a>
<li><a href="#ply_nn38b">Using Python's Optimized Mode</a>
<li><a href="#ply_nn44">Advanced Debugging</a>
<ul>
<li><a href="#ply_nn45">Debugging the lex() and yacc() commands</a>
@ -85,7 +85,7 @@ dave@dabeaz.com<br>
<H2><a name="ply_nn1"></a>1. Preface and Requirements</H2>
<H2><a name="ply_nn0"></a>1. Preface and Requirements</H2>
<p>
@ -552,21 +552,18 @@ Within the rule, the <tt>lineno</tt> attribute of the underlying lexer <tt>t.lex
After the line number is updated, the token is simply discarded since nothing is returned.
<p>
<tt>lex.py</tt> does not perform and kind of automatic column tracking. However, it does record positional
<tt>lex.py</tt> does not perform any kind of automatic column tracking. However, it does record positional
information related to each token in the <tt>lexpos</tt> attribute. Using this, it is usually possible to compute
column information as a separate step. For instance, just count backwards until you reach a newline.
<blockquote>
<pre>
# Compute column.
# Compute column.
# input is the input text string
# token is a token instance
def find_column(input,token):
last_cr = input.rfind('\n',0,token.lexpos)
if last_cr < 0:
last_cr = 0
column = (token.lexpos - last_cr) + 1
return column
def find_column(input, token):
line_start = input.rfind('\n', 0, token.lexpos) + 1
return (token.lexpos - line_start) + 1
</pre>
</blockquote>
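<p>
For instance, a hypothetical error handler could report the column of an offending
character (a sketch, not from the official docs; it assumes the original input
string is available as <tt>data</tt>):
<blockquote>
<pre>
def t_error(t):
    col = find_column(data, t)
    print("Illegal character %r at line %d, column %d" %
          (t.value[0], t.lexer.lineno, col))
    t.lexer.skip(1)
</pre>
</blockquote>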
@ -718,7 +715,7 @@ be used to control the lexer.
None if the end of the input text has been reached.
</ul>
<H3><a name="ply_nn14"></a>4.12 The @TOKEN decorator</H3>
<H3><a name="ply_nn14b"></a>4.12 The @TOKEN decorator</H3>
In some applications, you may want to build tokens from a series of
@ -1418,7 +1415,7 @@ However, if the closing right brace is encountered, the rule <tt>t_ccode_rbrace<
position), stores it, and returns a token 'CCODE' containing all of that text. When returning the token, the lexing state is restored back to its
initial state.
<H3><a name="ply_nn21"></a>4.20 Miscellaneous Issues</H3>
<H3><a name="ply_nn21b"></a>4.20 Miscellaneous Issues</H3>
<P>
@ -1438,10 +1435,13 @@ well as for input text.
<blockquote>
<pre>
lex.lex(reflags=re.UNICODE)
lex.lex(reflags=re.UNICODE | re.VERBOSE)
</pre>
</blockquote>
Note: by default, <tt>reflags</tt> is set to <tt>re.VERBOSE</tt>. If you provide
your own flags, you may need to include this for PLY to preserve its normal behavior.
<p>
<li>Since the lexer is written entirely in Python, its performance is
largely determined by that of the Python <tt>re</tt> module. Although
@ -1890,7 +1890,7 @@ literals = ['+','-','*','/' ]
<b>Character literals are limited to a single character</b>. Thus, it is not legal to specify literals such as <tt>'&lt;='</tt> or <tt>'=='</tt>. For this, use
the normal lexing rules (e.g., define a rule such as <tt>t_EQ = r'=='</tt>).
<H3><a name="ply_nn26"></a>6.4 Empty Productions</H3>
<H3><a name="ply_nn26b"></a>6.4 Empty Productions</H3>
<tt>yacc.py</tt> can handle empty productions by defining a rule like this:
@ -2208,7 +2208,7 @@ the contents of the
<tt>parser.out</tt> debugging file with an appropriately high level of
caffeination.
<H3><a name="ply_nn28"></a>6.7 The parser.out file</H3>
<H3><a name="ply_nn28b"></a>6.7 The parser.out file</H3>
Tracking down shift/reduce and reduce/reduce conflicts is one of the finer pleasures of using an LR
@ -2950,7 +2950,7 @@ def p_expression_binop(p):
</pre>
</blockquote>
<H3><a name="ply_nn35"></a>6.11 Embedded Actions</H3>
<H3><a name="ply_nn35b"></a>6.11 Embedded Actions</H3>
The parsing technique used by yacc only allows actions to be executed at the end of a rule. For example,
@ -3140,7 +3140,7 @@ each time it runs (which may take awhile depending on how large your grammar is)
<blockquote>
<pre>
parser = yacc.parse(debug=True)
parser.parse(input_text, debug=True)
</pre>
</blockquote>
@ -3270,7 +3270,7 @@ If necessary, arbitrary attributes can be attached to the lexer or parser object
For example, if you wanted to have different parsing modes, you could attach a mode
attribute to the parser object and look at it later.
<H2><a name="ply_nn38"></a>8. Using Python's Optimized Mode</H2>
<H2><a name="ply_nn38b"></a>8. Using Python's Optimized Mode</H2>
Because PLY uses information from doc-strings, parsing and lexing

View file

@ -109,8 +109,8 @@ def t_code_error(t):
def t_error(t):
print "%d: Illegal character '%s'" % (t.lexer.lineno, t.value[0])
print t.value
print("%d: Illegal character '%s'" % (t.lexer.lineno, t.value[0]))
print(t.value)
t.lexer.skip(1)
lex.lex()

View file

@ -22,18 +22,18 @@ def p_defsection(p):
'''defsection : definitions SECTION
| SECTION'''
p.lexer.lastsection = 1
print "tokens = ", repr(tokenlist)
print
print "precedence = ", repr(preclist)
print
print "# -------------- RULES ----------------"
print
print("tokens = ", repr(tokenlist))
print()
print("precedence = ", repr(preclist))
print()
print("# -------------- RULES ----------------")
print()
def p_rulesection(p):
'''rulesection : rules SECTION'''
print "# -------------- RULES END ----------------"
print("# -------------- RULES END ----------------")
print_code(p[2], 0)
@ -49,7 +49,7 @@ def p_definition_literal(p):
def p_definition_start(p):
'''definition : START ID'''
print "start = '%s'" % p[2]
print("start = '%s'" % p[2])
def p_definition_token(p):
@ -138,7 +138,7 @@ def p_rules(p):
rulecount = 1
for r in rule[1]:
# r contains one of the rule possibilities
print "def p_%s_%d(p):" % (rulename, rulecount)
print("def p_%s_%d(p):" % (rulename, rulecount))
prod = []
prodcode = ""
for i in range(len(r)):
@ -155,17 +155,17 @@ def p_rules(p):
embed_count += 1
else:
prod.append(item)
print " '''%s : %s'''" % (rulename, " ".join(prod))
print(" '''%s : %s'''" % (rulename, " ".join(prod)))
# Emit code
print_code(prodcode, 4)
print
print()
rulecount += 1
for e, code in embedded:
print "def p_%s(p):" % e
print " '''%s : '''" % e
print("def p_%s(p):" % e)
print(" '''%s : '''" % e)
print_code(code, 4)
print
print()
def p_rule(p):
@ -204,7 +204,7 @@ def p_morerules(p):
p[0] = p[1]
p[0].append(p[3])
# print "morerules", len(p), p[0]
# print("morerules", len(p), p[0])
def p_rulelist(p):
@ -241,4 +241,4 @@ def print_code(code, indent):
return
codelines = code.splitlines()
for c in codelines:
print "%s# %s" % (" " * indent, c)
print("%s# %s" % (" " * indent, c))

View file

@ -29,14 +29,14 @@ import yparse
from ply import *
if len(sys.argv) == 1:
print "usage : yply.py [-nocode] inputfile"
print("usage : yply.py [-nocode] inputfile")
raise SystemExit
if len(sys.argv) == 3:
if sys.argv[1] == '-nocode':
yparse.emit_code = 0
else:
print "Unknown option '%s'" % sys.argv[1]
print("Unknown option '%s'" % sys.argv[1])
raise SystemExit
filename = sys.argv[2]
else:
@ -44,8 +44,8 @@ else:
yacc.parse(open(filename).read())
print """
print("""
if __name__ == '__main__':
from ply import *
yacc.yacc()
"""
""")

View file

@ -1,5 +1,5 @@
# PLY package
# Author: David Beazley (dave@dabeaz.com)
__version__ = '3.9'
__version__ = '3.11'
__all__ = ['lex','yacc']

View file

@ -5,7 +5,7 @@
# Copyright (C) 2007
# All rights reserved
#
# This module implements an ANSI-C style lexical preprocessor for PLY.
# This module implements an ANSI-C style lexical preprocessor for PLY.
# -----------------------------------------------------------------------------
from __future__ import generators
@ -77,7 +77,8 @@ def t_CPP_COMMENT2(t):
r'(//.*?(\n|$))'
# replace with '/n'
t.type = 'CPP_WS'; t.value = '\n'
return t
def t_error(t):
t.type = t.value[0]
t.value = t.value[0]
@ -91,8 +92,8 @@ import os.path
# -----------------------------------------------------------------------------
# trigraph()
#
# Given an input string, this function replaces all trigraph sequences.
#
# Given an input string, this function replaces all trigraph sequences.
# The following mapping is used:
#
# ??= #
@ -262,7 +263,7 @@ class Preprocessor(object):
# ----------------------------------------------------------------------
# add_path()
#
# Adds a search path to the preprocessor.
# Adds a search path to the preprocessor.
# ----------------------------------------------------------------------
def add_path(self,path):
@ -306,7 +307,7 @@ class Preprocessor(object):
# ----------------------------------------------------------------------
# tokenstrip()
#
#
# Remove leading/trailing whitespace tokens from a token list
# ----------------------------------------------------------------------
@ -332,7 +333,7 @@ class Preprocessor(object):
# argument. Each argument is represented by a list of tokens.
#
# When collecting arguments, leading and trailing whitespace is removed
# from each argument.
# from each argument.
#
# This function properly handles nested parenthesis and commas---these do not
# define new arguments.
@ -344,7 +345,7 @@ class Preprocessor(object):
current_arg = []
nesting = 1
tokenlen = len(tokenlist)
# Search for the opening '('.
i = 0
while (i < tokenlen) and (tokenlist[i].type in self.t_WS):
@ -378,7 +379,7 @@ class Preprocessor(object):
else:
current_arg.append(t)
i += 1
# Missing end argument
self.error(self.source,tokenlist[-1].lineno,"Missing ')' in macro arguments")
return 0, [],[]
@ -390,9 +391,9 @@ class Preprocessor(object):
# This is used to speed up macro expansion later on---we'll know
# right away where to apply patches to the value to form the expansion
# ----------------------------------------------------------------------
def macro_prescan(self,macro):
macro.patch = [] # Standard macro arguments
macro.patch = [] # Standard macro arguments
macro.str_patch = [] # String conversion expansion
macro.var_comma_patch = [] # Variadic macro comma patch
i = 0
@ -410,10 +411,11 @@ class Preprocessor(object):
elif (i > 0 and macro.value[i-1].value == '##'):
macro.patch.append(('c',argnum,i-1))
del macro.value[i-1]
i -= 1
continue
elif ((i+1) < len(macro.value) and macro.value[i+1].value == '##'):
macro.patch.append(('c',argnum,i))
i += 1
del macro.value[i + 1]
continue
# Standard expansion
else:
@ -439,7 +441,7 @@ class Preprocessor(object):
rep = [copy.copy(_x) for _x in macro.value]
# Make string expansion patches. These do not alter the length of the replacement sequence
str_expansion = {}
for argnum, i in macro.str_patch:
if argnum not in str_expansion:
@ -457,7 +459,7 @@ class Preprocessor(object):
# Make all other patches. The order of these matters. It is assumed that the patch list
# has been sorted in reverse order of patch location since replacements will cause the
# size of the replacement sequence to expand from the patch point.
expanded = { }
for ptype, argnum, i in macro.patch:
# Concatenation. Argument is left unexpanded
@ -494,7 +496,7 @@ class Preprocessor(object):
if t.value in self.macros and t.value not in expanded:
# Yes, we found a macro match
expanded[t.value] = True
m = self.macros[t.value]
if not m.arglist:
# A simple macro
@ -508,7 +510,7 @@ class Preprocessor(object):
j = i + 1
while j < len(tokens) and tokens[j].type in self.t_WS:
j += 1
if tokens[j].value == '(':
if j < len(tokens) and tokens[j].value == '(':
tokcount,args,positions = self.collect_args(tokens[j:])
if not m.variadic and len(args) != len(m.arglist):
self.error(self.source,t.lineno,"Macro %s requires %d arguments" % (t.value,len(m.arglist)))
@ -526,7 +528,7 @@ class Preprocessor(object):
else:
args[len(m.arglist)-1] = tokens[j+positions[len(m.arglist)-1]:j+tokcount-1]
del args[len(m.arglist):]
# Get macro replacement text
rep = self.macro_expand_args(m,args)
rep = self.expand_macros(rep,expanded)
@ -534,18 +536,24 @@ class Preprocessor(object):
r.lineno = t.lineno
tokens[i:j+tokcount] = rep
i += len(rep)
else:
# This is not a macro invocation. It is just a
# word that happens to match a macro's name.
# Move on to the next token.
i += 1
del expanded[t.value]
continue
elif t.value == '__LINE__':
t.type = self.t_INTEGER
t.value = self.t_INTEGER_TYPE(t.lineno)
i += 1
return tokens
# ----------------------------------------------------------------------
# ----------------------------------------------------------------------
# evalexpr()
#
#
# Evaluate an expression token sequence for the purposes of evaluating
# integral expressions.
# ----------------------------------------------------------------------
@ -592,7 +600,7 @@ class Preprocessor(object):
tokens[i].value = str(tokens[i].value)
while tokens[i].value[-1] not in "0123456789abcdefABCDEF":
tokens[i].value = tokens[i].value[:-1]
expr = "".join([str(x.value) for x in tokens])
expr = expr.replace("&&"," and ")
expr = expr.replace("||"," or ")
@ -617,7 +625,7 @@ class Preprocessor(object):
if not source:
source = ""
self.define("__FILE__ \"%s\"" % source)
self.source = source
@ -636,7 +644,7 @@ class Preprocessor(object):
for tok in x:
if tok.type in self.t_WS and '\n' in tok.value:
chunk.append(tok)
dirtokens = self.tokenstrip(x[i+1:])
if dirtokens:
name = dirtokens[0].value
@ -644,7 +652,7 @@ class Preprocessor(object):
else:
name = ""
args = []
if name == 'define':
if enable:
for tok in self.expand_macros(chunk):
@ -704,7 +712,7 @@ class Preprocessor(object):
iftrigger = True
else:
self.error(self.source,dirtokens[0].lineno,"Misplaced #elif")
elif name == 'else':
if ifstack:
if ifstack[-1][0]:
@ -874,7 +882,7 @@ class Preprocessor(object):
def parse(self,input,source=None,ignore={}):
self.ignore = ignore
self.parser = self.parsegen(input,source)
# ----------------------------------------------------------------------
# token()
#
@ -904,14 +912,3 @@ if __name__ == '__main__':
tok = p.token()
if not tok: break
print(p.source, tok)
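
Driving the preprocessor standalone follows the same pattern as the __main__ block
above. A minimal sketch (the macro and the source name are made up for illustration):

from ply.lex import lex
from ply.cpp import *   # star import so lex() can find the preprocessor's token rules

p = Preprocessor(lex())
p.parse("#define SQUARE(x) ((x)*(x))\nSQUARE(3)\n", source="example.c")
print("".join(tok.value for tok in p.parser))   # SQUARE(3) expands to ((3)*(3))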

View file

@ -16,7 +16,7 @@ tokens = [
'OR', 'AND', 'NOT', 'XOR', 'LSHIFT', 'RSHIFT',
'LOR', 'LAND', 'LNOT',
'LT', 'LE', 'GT', 'GE', 'EQ', 'NE',
# Assignment (=, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, |=)
'EQUALS', 'TIMESEQUAL', 'DIVEQUAL', 'MODEQUAL', 'PLUSEQUAL', 'MINUSEQUAL',
'LSHIFTEQUAL','RSHIFTEQUAL', 'ANDEQUAL', 'XOREQUAL', 'OREQUAL',
@ -29,7 +29,7 @@ tokens = [
# Ternary operator (?)
'TERNARY',
# Delimeters ( ) [ ] { } , . ; :
'LPAREN', 'RPAREN',
'LBRACKET', 'RBRACKET',
@ -39,7 +39,7 @@ tokens = [
# Ellipsis (...)
'ELLIPSIS',
]
# Operators
t_PLUS = r'\+'
t_MINUS = r'-'
@ -125,9 +125,3 @@ def t_CPPCOMMENT(t):
r'//.*\n'
t.lexer.lineno += 1
return t

View file

@ -1,7 +1,7 @@
# -----------------------------------------------------------------------------
# ply: lex.py
#
# Copyright (C) 2001-2016
# Copyright (C) 2001-2018
# David M. Beazley (Dabeaz LLC)
# All rights reserved.
#
@ -31,7 +31,7 @@
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# -----------------------------------------------------------------------------
__version__ = '3.10'
__version__ = '3.11'
__tabversion__ = '3.10'
import re
@ -179,12 +179,12 @@ class Lexer:
with open(filename, 'w') as tf:
tf.write('# %s.py. This file automatically created by PLY (version %s). Don\'t edit!\n' % (basetabmodule, __version__))
tf.write('_tabversion = %s\n' % repr(__tabversion__))
tf.write('_lextokens = set(%s)\n' % repr(tuple(self.lextokens)))
tf.write('_lexreflags = %s\n' % repr(self.lexreflags))
tf.write('_lextokens = set(%s)\n' % repr(tuple(sorted(self.lextokens))))
tf.write('_lexreflags = %s\n' % repr(int(self.lexreflags)))
tf.write('_lexliterals = %s\n' % repr(self.lexliterals))
tf.write('_lexstateinfo = %s\n' % repr(self.lexstateinfo))
# Rewrite the lexstatere table, replacing function objects with function names
# Rewrite the lexstatere table, replacing function objects with function names
tabre = {}
for statename, lre in self.lexstatere.items():
titem = []
@ -230,7 +230,7 @@ class Lexer:
titem = []
txtitem = []
for pat, func_name in lre:
titem.append((re.compile(pat, lextab._lexreflags | re.VERBOSE), _names_to_funcs(func_name, fdict)))
titem.append((re.compile(pat, lextab._lexreflags), _names_to_funcs(func_name, fdict)))
self.lexstatere[statename] = titem
self.lexstateretext[statename] = txtitem
@ -495,7 +495,7 @@ def _form_master_re(relist, reflags, ldict, toknames):
return []
regex = '|'.join(relist)
try:
lexre = re.compile(regex, re.VERBOSE | reflags)
lexre = re.compile(regex, reflags)
# Build the index to function map for the matching engine
lexindexfunc = [None] * (max(lexre.groupindex.values()) + 1)
@ -531,12 +531,11 @@ def _form_master_re(relist, reflags, ldict, toknames):
# calling this with s = "t_foo_bar_SPAM" might return (('foo','bar'),'SPAM')
# -----------------------------------------------------------------------------
def _statetoken(s, names):
nonstate = 1
parts = s.split('_')
for i, part in enumerate(parts[1:], 1):
if part not in names and part != 'ANY':
break
if i > 1:
states = tuple(parts[1:i])
else:
@ -758,7 +757,7 @@ class LexerReflect(object):
continue
try:
c = re.compile('(?P<%s>%s)' % (fname, _get_regex(f)), re.VERBOSE | self.reflags)
c = re.compile('(?P<%s>%s)' % (fname, _get_regex(f)), self.reflags)
if c.match(''):
self.log.error("%s:%d: Regular expression for rule '%s' matches empty string", file, line, f.__name__)
self.error = True
@ -782,7 +781,7 @@ class LexerReflect(object):
continue
try:
c = re.compile('(?P<%s>%s)' % (name, r), re.VERBOSE | self.reflags)
c = re.compile('(?P<%s>%s)' % (name, r), self.reflags)
if (c.match('')):
self.log.error("Regular expression for rule '%s' matches empty string", name)
self.error = True
@ -861,7 +860,7 @@ class LexerReflect(object):
# Build all of the regular expression rules from definitions in the supplied module
# -----------------------------------------------------------------------------
def lex(module=None, object=None, debug=False, optimize=False, lextab='lextab',
reflags=0, nowarn=False, outputdir=None, debuglog=None, errorlog=None):
reflags=int(re.VERBOSE), nowarn=False, outputdir=None, debuglog=None, errorlog=None):
if lextab is None:
lextab = 'lextab'
@ -949,8 +948,6 @@ def lex(module=None, object=None, debug=False, optimize=False, lextab='lextab',
# Add rules defined by functions first
for fname, f in linfo.funcsym[state]:
line = f.__code__.co_firstlineno
file = f.__code__.co_filename
regex_list.append('(?P<%s>%s)' % (fname, _get_regex(f)))
if debug:
debuglog.info("lex: Adding rule %s -> '%s' (state '%s')", fname, _get_regex(f), state)
@ -1041,6 +1038,8 @@ def lex(module=None, object=None, debug=False, optimize=False, lextab='lextab',
outputdir = os.path.dirname(srcfile)
try:
lexobj.writetab(lextab, outputdir)
if lextab in sys.modules:
del sys.modules[lextab]
except IOError as e:
errorlog.warning("Couldn't write lextab module %r. %s" % (lextab, e))
@ -1097,4 +1096,3 @@ def TOKEN(r):
# Alternative spelling of the TOKEN decorator
Token = TOKEN
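
The reflags changes above exist because, on Python 3, the re module's flag
constants are RegexFlag enum instances rather than plain integers, so their
repr() is not a literal that a generated lextab module can evaluate. A quick
illustration (the exact repr varies by Python version):

import re

flags = re.UNICODE | re.VERBOSE
print(repr(flags))        # e.g. <RegexFlag.VERBOSE|UNICODE: 96> -- not valid source code
print(repr(int(flags)))   # 96 -- safe to write into the generated lextab module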

View file

@ -1,7 +1,7 @@
# -----------------------------------------------------------------------------
# ply: yacc.py
#
# Copyright (C) 2001-2016
# Copyright (C) 2001-2018
# David M. Beazley (Dabeaz LLC)
# All rights reserved.
#
@ -32,7 +32,7 @@
# -----------------------------------------------------------------------------
#
# This implements an LR parser that is constructed from grammar rules defined
# as Python functions. The grammer is specified by supplying the BNF inside
# as Python functions. The grammar is specified by supplying the BNF inside
# Python documentation strings. The inspiration for this technique was borrowed
# from John Aycock's Spark parsing system. PLY might be viewed as cross between
# Spark and the GNU bison utility.
@ -64,10 +64,9 @@ import types
import sys
import os.path
import inspect
import base64
import warnings
__version__ = '3.10'
__version__ = '3.11'
__tabversion__ = '3.10'
#-----------------------------------------------------------------------------
@ -268,6 +267,9 @@ class YaccProduction:
def lexpos(self, n):
return getattr(self.slice[n], 'lexpos', 0)
def set_lexpos(self, n, lexpos):
self.slice[n].lexpos = lexpos
def lexspan(self, n):
startpos = getattr(self.slice[n], 'lexpos', 0)
endpos = getattr(self.slice[n], 'endlexpos', startpos)
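
With the new set_lexpos() (added for issue #148, mirroring the existing
set_lineno()), a rule can adjust the position recorded for a grammar symbol.
A hypothetical rule that stamps the result with the opening parenthesis:

def p_expression_group(p):
    'expression : LPAREN expression RPAREN'
    p[0] = p[2]
    # record the group's position as the '(' token, not the inner expression
    p.set_lexpos(0, p.lexpos(1))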
@ -1360,7 +1362,7 @@ class Production(object):
p = LRItem(self, n)
# Precompute the list of productions immediately following.
try:
p.lr_after = Prodnames[p.prod[n+1]]
p.lr_after = self.Prodnames[p.prod[n+1]]
except (IndexError, KeyError):
p.lr_after = []
try:
@ -2301,7 +2303,6 @@ class LRGeneratedTable(LRTable):
# -----------------------------------------------------------------------------
def dr_relation(self, C, trans, nullable):
dr_set = {}
state, N = trans
terms = []
@ -2735,6 +2736,7 @@ class LRGeneratedTable(LRTable):
f.write('''
# %s
# This file is automatically generated. Do not edit.
# pylint: disable=W,C,R
_tabversion = %r
_lr_method = %r
@ -2968,28 +2970,20 @@ class ParserReflect(object):
# Compute a signature over the grammar
def signature(self):
parts = []
try:
from hashlib import md5
except ImportError:
from md5 import md5
try:
sig = md5()
if self.start:
sig.update(self.start.encode('latin-1'))
parts.append(self.start)
if self.prec:
sig.update(''.join([''.join(p) for p in self.prec]).encode('latin-1'))
parts.append(''.join([''.join(p) for p in self.prec]))
if self.tokens:
sig.update(' '.join(self.tokens).encode('latin-1'))
parts.append(' '.join(self.tokens))
for f in self.pfuncs:
if f[3]:
sig.update(f[3].encode('latin-1'))
parts.append(f[3])
except (TypeError, ValueError):
pass
digest = base64.b16encode(sig.digest())
if sys.version_info[0] >= 3:
digest = digest.decode('latin-1')
return digest
return ''.join(parts)
# -----------------------------------------------------------------------------
# validate_modules()
@ -3080,7 +3074,7 @@ class ParserReflect(object):
self.error = True
return
self.tokens = tokens
self.tokens = sorted(tokens)
# Validate the tokens
def validate_tokens(self):
@ -3240,9 +3234,13 @@ def yacc(method='LALR', debug=yaccdebug, module=None, tabmodule=tab_module, star
if module:
_items = [(k, getattr(module, k)) for k in dir(module)]
pdict = dict(_items)
# If no __file__ attribute is available, try to obtain it from the __module__ instead
# If no __file__ or __package__ attributes are available, try to obtain them
# from the __module__ instead
if '__file__' not in pdict:
pdict['__file__'] = sys.modules[pdict['__module__']].__file__
if '__package__' not in pdict and '__module__' in pdict:
if hasattr(sys.modules[pdict['__module__']], '__package__'):
pdict['__package__'] = sys.modules[pdict['__module__']].__package__
else:
pdict = get_caller_module_dict(2)
@ -3484,6 +3482,8 @@ def yacc(method='LALR', debug=yaccdebug, module=None, tabmodule=tab_module, star
if write_tables:
try:
lr.write_table(tabmodule, outputdir, signature)
if tabmodule in sys.modules:
del sys.modules[tabmodule]
except IOError as e:
errorlog.warning("Couldn't create %r. %s" % (tabmodule, e))

View file

@ -3,7 +3,7 @@
# This is a support program that auto-generates different versions of the YACC parsing
# function with different features removed for the purposes of performance.
#
# Users should edit the method LParser.parsedebug() in yacc.py. The source code
# Users should edit the method LRParser.parsedebug() in yacc.py. The source code
# for that method is then used to create the other methods. See the comments in
# yacc.py for further details.
@ -67,8 +67,3 @@ def main():
if __name__ == '__main__':
main()

View file

@ -17,7 +17,7 @@ PLY is extremely easy to use and provides very extensive error checking.
It is compatible with both Python 2 and Python 3.
""",
license="""BSD""",
version = "3.10",
version = "3.11",
author = "David Beazley",
author_email = "dave@dabeaz.com",
maintainer = "David Beazley",

View file

@ -3,5 +3,6 @@ conditions. To run:
$ python testlex.py
$ python testyacc.py
$ python testcpp.py
The script 'cleanup.sh' restores this directory to its original state.

test/lex_optimize4.py Normal file
View file

@ -0,0 +1,26 @@
# -----------------------------------------------------------------------------
# lex_optimize4.py
# -----------------------------------------------------------------------------
import re
import sys
if ".." not in sys.path: sys.path.insert(0,"..")
import ply.lex as lex
tokens = [
"PLUS",
"MINUS",
"NUMBER",
]
t_PLUS = r'\+?'
t_MINUS = r'-'
t_NUMBER = r'(\d+)'
def t_error(t):
pass
# Build the lexer
lex.lex(optimize=True, lextab="opt4tab", reflags=re.UNICODE)
lex.runmain(data="3+4")

test/testcpp.py Normal file
View file

@ -0,0 +1,101 @@
from unittest import TestCase, main
from multiprocessing import Process, Queue
from six.moves.queue import Empty
import sys
if ".." not in sys.path:
sys.path.insert(0, "..")
from ply.lex import lex
from ply.cpp import *
def preprocessing(in_, out_queue):
out = None
try:
p = Preprocessor(lex())
p.parse(in_)
tokens = [t.value for t in p.parser]
out = "".join(tokens)
finally:
out_queue.put(out)
class CPPTests(TestCase):
"Tests related to ANSI-C style lexical preprocessor."
def __test_preprocessing(self, in_, expected, time_limit = 1.0):
out_queue = Queue()
preprocessor = Process(
name = "PLY`s C preprocessor",
target = preprocessing,
args = (in_, out_queue)
)
preprocessor.start()
try:
out = out_queue.get(timeout = time_limit)
except Empty:
preprocessor.terminate()
raise RuntimeError("Time limit exceeded!")
else:
self.assertMultiLineEqual(out, expected)
def test_concatenation(self):
self.__test_preprocessing("""\
#define a(x) x##_
#define b(x) _##x
#define c(x) _##x##_
#define d(x,y) _##x##y##_
a(i)
b(j)
c(k)
d(q,s)"""
, """\
i_
_j
_k_
_qs_"""
)
def test_deadloop_macro(self):
# If a word matches the name of a parameterized macro, an
# attempt to expand that word as a macro used to send the
# parser into an infinite loop.
self.__test_preprocessing("""\
#define a(x) x
a;"""
, """\
a;"""
)
def test_index_error(self):
# If there are no tokens after a word ("a") that matches the
# name of a parameterized macro, attempting to expand that
# word used to raise an IndexError.
self.__test_preprocessing("""\
#define a(x) x
a"""
, """\
a"""
)
main()

View file

@ -514,6 +514,26 @@ class LexBuildOptionTests(unittest.TestCase):
except OSError:
pass
def test_lex_optimize4(self):
# Regression test to make sure that reflags works correctly
# on Python 3.
for extension in ['py', 'pyc']:
try:
os.remove("opt4tab.{0}".format(extension))
except OSError:
pass
run_import("lex_optimize4")
run_import("lex_optimize4")
for extension in ['py', 'pyc']:
try:
os.remove("opt4tab.{0}".format(extension))
except OSError:
pass
def test_lex_opt_alias(self):
try:
os.remove("aliastab.py")