llvm-6502/utils/DSAextract.py
Patrick Meredith 7e1cf1c8f4 This is a script to extract nodes and edges associated with those nodes
from a dot file that is the output of DSA.  Nodes to extract
are specified by giving the name of the node seen in the graphical
representation, i.e. in the .ps if the node is specified %xyz
asking for just x, xy, or xyz will retain it in the output file.
Because it operates on substrings underspecifying may result
in additional unexpected nodes.  Be as specific as possible.
Obviously, however, if you ask for %xyz and there is a
getelementptr of %xyz you will get both nodes.  Some manual
editing may still be necessary because of this, but this script
can pare down 10,000 line files to 20 line files, making like easier.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24851 91177308-0d34-0410-b5e6-96231b3b80d8
2005-12-19 01:23:31 +00:00

111 lines
3.3 KiB
Python

#! /usr/bin/python
#this is a script to extract given named nodes from a dot file, with
#the associated edges. An edge is kept iff for edge x -> y
# x and y are both nodes specified to be kept.
#known issues: if a line contains '->' and is not an edge line
#problems will occur. If node labels do not begin with
#Node this also will not work. Since this is designed to work
#on DSA dot output and not general dot files this is ok.
#If you want to use this on other files rename the node labels
#to Node[.*] with a script or something. This also relies on
#the length of a node name being 13 characters (as it is in all
#DSA dot output files)
#Note that the name of the node can be any substring of the actual
#name in the dot file. Thus if you say specify COLLAPSED
#as a parameter this script will pull out all COLLAPSED
#nodes in the file
#Specifying escape characters in the name like \n also will not work,
#as Python
#will make it \\n, I'm not really sure how to fix this
#currently the script prints the names it is searching for
#to STDOUT, so you can check to see if they are what you intend
import re
import string
import sys
if len(sys.argv) < 3:
print 'usage is ./DSAextract <dot_file_to_modify> \
<output_file> [list of nodes to extract]'
#open the input file
input = open(sys.argv[1], 'r')
#construct a set of node names
node_name_set = set()
for name in sys.argv[3:]:
node_name_set |= set([name])
#construct a list of compiled regular expressions from the
#node_name_set
regexp_list = []
for name in node_name_set:
regexp_list.append(re.compile(name))
#used to see what kind of line we are on
nodeexp = re.compile('Node')
#used to check to see if the current line is an edge line
arrowexp = re.compile('->')
node_set = set()
#read the file one line at a time
buffer = input.readline()
while buffer != '':
#filter out the unecessary checks on all the edge lines
if not arrowexp.search(buffer):
#check to see if this is a node we are looking for
for regexp in regexp_list:
#if this name is for the current node, add the dot variable name
#for the node (it will be Node(hex number)) to our set of nodes
if regexp.search(buffer):
node_set |= set([re.split('\s+',buffer,2)[1]])
buffer = input.readline()
#test code
#print '\n'
print node_name_set
#print node_set
#open the output file
output = open(sys.argv[2], 'w')
#start the second pass over the file
input = open(sys.argv[1], 'r')
buffer = input.readline()
while buffer != '':
#there are three types of lines we are looking for
#1) node lines, 2) edge lines 3) support lines (like page size, etc)
#is this an edge line?
#note that this is no completely robust, if a none edge line
#for some reason contains -> it will be missidentified
#hand edit the file if this happens
if arrowexp.search(buffer):
#check to make sure that both nodes are in the node list
#if they are print this to output
nodes = arrowexp.split(buffer)
nodes[0] = string.strip(nodes[0])
nodes[1] = string.strip(nodes[1])
if nodes[0][:13] in node_set and \
nodes[1][:13] in node_set:
output.write(buffer)
elif nodeexp.search(buffer): #this is a node line
node = re.split('\s+', buffer,2)[1]
if node in node_set:
output.write(buffer)
else: #this is a support line
output.write(buffer)
buffer = input.readline()