JavaScript parser in Python [closed]

JavaScript parsers exist in at least C and Java (Mozilla), JavaScript (Mozilla again), and Ruby. Is there one currently available for Python?

I don't need a JavaScript interpreter, per se, just a parser that conforms to the ECMA-262 standard.

A quick Google search revealed no immediate answers, so I'm asking the SO community.

Answers:

Answer

Nowadays, there is at least one better tool, called slimit:

SlimIt is a JavaScript minifier written in Python. It compiles JavaScript into more compact code so that it downloads and runs faster.

SlimIt also provides a library that includes a JavaScript parser, lexer, pretty printer and a tree visitor.

Demo:

Imagine we have the following JavaScript code:

$.ajax({
    type: "POST",
    url: 'http://www.example.com',
    data: {
        email: '[email protected]',
        phone: '9999999999',
        name: 'XYZ'
    }
});

And now we need to get email, phone and name values from the data object.

The idea here is to instantiate a slimit parser, visit all nodes, filter out the assignments, and put them into a dictionary:

from slimit import ast
from slimit.parser import Parser
from slimit.visitors import nodevisitor


data = """
$.ajax({
    type: "POST",
    url: 'http://www.example.com',
    data: {
        email: '[email protected]',
        phone: '9999999999',
        name: 'XYZ'
    }
});
"""

parser = Parser()
tree = parser.parse(data)
fields = {getattr(node.left, 'value', ''): getattr(node.right, 'value', '')
          for node in nodevisitor.visit(tree)
          if isinstance(node, ast.Assign)}

print(fields)

It prints:

{'name': "'XYZ'", 
 'url': "'http://www.example.com'", 
 'type': '"POST"', 
 'phone': "'9999999999'", 
 'data': '', 
 'email': "'[email protected]'"}
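Note that the values above still carry their JavaScript quote characters, since the string literals are reported verbatim. A minimal sketch of stripping them, assuming each value is either empty or a simple quoted string without escape sequences. `strip_js_quotes` is a hypothetical helper, not part of slimit, and the sample dictionary is a subset copied from the output above:

```python
def strip_js_quotes(value):
    """Remove one pair of matching quotes from a JavaScript string literal."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ('"', "'"):
        return value[1:-1]
    # Anything else (empty string, unquoted text) is returned unchanged.
    return value


# Subset of the values printed above.
fields = {
    'name': "'XYZ'",
    'url': "'http://www.example.com'",
    'type': '"POST"',
    'phone': "'9999999999'",
    'data': '',
}

cleaned = {key: strip_js_quotes(value) for key, value in fields.items()}
print(cleaned)
```

Literals containing escape sequences such as `\'` would need real unescaping, which this sketch does not attempt.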
Answer

As pib mentioned, pynarcissus is a JavaScript tokenizer written in Python. It seems to have some rough edges, but so far it has been working well for what I want to accomplish.

Updated: I took another crack at pynarcissus, and below is a working starting point for using PyNarcissus in a visitor-pattern-like system. Unfortunately, my current client bought the next iteration of my experiments and has decided not to make the source public. A cleaner version of the code below is available as a gist.

from collections import defaultdict

from pynarcissus import jsparser


class Visitor(object):

    # Node attributes that may hold child nodes not reached by iteration
    CHILD_ATTRS = ['thenPart', 'elsePart', 'expression', 'body', 'initializer']

    def __init__(self, filepath):
        self.filepath = filepath
        # Functions by line number and set of names
        self.functions = defaultdict(set)
        with open(filepath) as myFile:
            self.source = myFile.read()

        self.root = jsparser.parse(self.source, self.filepath)
        self.visit(self.root)

    def look4Childen(self, node):
        for attr in self.CHILD_ATTRS:
            child = getattr(node, attr, None)
            if child:
                self.visit(child)

    def visit_NOOP(self, node):
        pass

    def visit_FUNCTION(self, node):
        # Named functions
        if node.type == "FUNCTION" and getattr(node, "name", None):
            print(str(node.lineno) + " | function " + node.name + " | " + self.source[node.start:node.end])

    def visit_IDENTIFIER(self, node):
        # Anonymous functions declared with var name = function() {};
        try:
            if node.type == "IDENTIFIER" and hasattr(node, "initializer") and node.initializer.type == "FUNCTION":
                print(str(node.lineno) + " | function " + node.name + " | " + self.source[node.start:node.initializer.end])
        except Exception:
            pass

    def visit_PROPERTY_INIT(self, node):
        # Anonymous functions declared as a property of an object
        try:
            if node.type == "PROPERTY_INIT" and node[1].type == "FUNCTION":
                print(str(node.lineno) + " | function " + node[0].value + " | " + self.source[node.start:node[1].end])
        except Exception:
            pass

    def visit(self, root):
        # Dispatch to visit_<TYPE> if it exists, otherwise to visit_NOOP
        call = lambda n: getattr(self, "visit_%s" % n.type, self.visit_NOOP)(n)
        call(root)
        self.look4Childen(root)
        for node in root:
            self.visit(node)


filepath = r"C:\Users\dward\Dropbox\juggernaut2\juggernaut\parser\test\data\jasmine.js"
outerspace = Visitor(filepath)
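The `visit_<TYPE>` dispatch used above can be tried without pynarcissus installed. The following is only a sketch of the pattern: `Node` is a hypothetical stand-in for a parser node (a list of children with a `type` tag), not the real pynarcissus class, and `FunctionCollector` is an illustrative name.

```python
class Node(list):
    """Toy stand-in for a parser node: a list of children with a type tag."""

    def __init__(self, type_, name=None, children=()):
        super(Node, self).__init__(children)
        self.type = type_
        self.name = name


class FunctionCollector(object):
    """Collects the names of FUNCTION nodes via visit_<TYPE> dispatch."""

    def __init__(self):
        self.names = []

    def visit_NOOP(self, node):
        pass

    def visit_FUNCTION(self, node):
        if node.name:
            self.names.append(node.name)

    def visit(self, node):
        # Look up a handler named after the node type; fall back to the no-op.
        handler = getattr(self, "visit_%s" % node.type, self.visit_NOOP)
        handler(node)
        for child in node:
            self.visit(child)


tree = Node("SCRIPT", children=[
    Node("FUNCTION", name="outer", children=[
        Node("FUNCTION", name="inner"),
    ]),
    Node("VAR"),
])

collector = FunctionCollector()
collector.visit(tree)
print(collector.names)
```

Node types without a matching `visit_` method, such as `VAR` here, fall through to `visit_NOOP`, so new handlers can be added one at a time.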
Answer

I have translated esprima.js to Python:

https://github.com/PiotrDabkowski/pyjsparser

It's a manual translation, so it's very fast: it takes about 1 second to parse the angular.js file (roughly 100k characters per second). It supports all of ECMAScript 5.1 and parts of version 6, for example arrow functions, const, and let.

Alternatively, you can use an automated translation of a newer version of esprima to Python, which works great and supports all of ECMAScript 6!
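Esprima-style parsers report the syntax tree as nested dictionaries and lists in the ESTree shape. A minimal sketch of walking such a tree, assuming that shape: the `tree` below is hand-written to mimic what a parse of `var answer = 42` would look like, and `iter_nodes` is a hypothetical helper, not part of pyjsparser.

```python
def iter_nodes(node):
    """Yield every dict that has a 'type' key, depth first."""
    if isinstance(node, dict):
        if 'type' in node:
            yield node
        for value in node.values():
            for found in iter_nodes(value):
                yield found
    elif isinstance(node, list):
        for item in node:
            for found in iter_nodes(item):
                yield found


# Hand-written sample in the ESTree shape for: var answer = 42
tree = {
    'type': 'Program',
    'body': [{
        'type': 'VariableDeclaration',
        'kind': 'var',
        'declarations': [{
            'type': 'VariableDeclarator',
            'id': {'type': 'Identifier', 'name': 'answer'},
            'init': {'type': 'Literal', 'value': 42, 'raw': '42'},
        }],
    }],
}

# Names of all declared variables.
declared = [n['id']['name'] for n in iter_nodes(tree)
            if n['type'] == 'VariableDeclarator']
print(declared)
```

With pyjsparser installed, the dictionary returned by `pyjsparser.parse(source)` should be usable in place of the hand-written `tree`.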

Answer

ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages.

The ANTLR site provides many grammars, including one for JavaScript.

As it happens, there is a Python API available, so you can call the lexer (recognizer) generated from the grammar directly from Python (good luck).

Answer

You can try python-spidermonkey. It is a wrapper over SpiderMonkey, which is the codename for Mozilla's C implementation of JavaScript.
