Often, we find the need to import a non-class-based Python module into our own modules. Sure, many proper Python packages exist that we can use to do our dirty work (https://docs.python.org/2/tutorial/modules.html#packages), but what about when we run into a random script that simply uses its own main() and various functions to get the job done?
If we try to import a Python script, we run into an issue if the sucker uses any arguments. Why? The main reason is that our imported script will most likely have the good ol':
if __name__ == "__main__":
    main()
When a script is run directly in Python, __name__ is set to "__main__". When the same file is imported instead, __name__ is the module's name, so the guard does not fire and main() is never called for us. No biggie, right? We’ll just import the module and then call the main() function, taking care to pass the arguments we want into the function. Right? WRONG. The problem we run into is that main() does not take arguments. Rather, the main() function most likely uses getopt, or another parser, to parse the command-line arguments passed to the script.
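To make the problem concrete, here is a minimal sketch of the kind of script I mean. The -d option, the printed messages, and example.com are hypothetical stand-ins rather than dnsrecon's actual code, and the sketch assumes Python 3:

```python
import getopt
import sys


def main():
    # Hypothetical script-style entry point: it accepts no parameters and
    # reads its options straight from sys.argv instead.
    opts, _ = getopt.getopt(sys.argv[1:], "d:")
    for flag, value in opts:
        if flag == "-d":
            print("Scanning domain: " + value)


# Passing arguments directly fails, because main() has no parameters.
try:
    main("-d", "example.com")
    call_failed = False
except TypeError as error:
    call_failed = True
    print("main() rejected our arguments: %s" % error)
```

The call never reaches getopt at all; Python raises TypeError as soon as we hand main() arguments it was not defined to accept.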
So, how do we overcome this issue?
One way that I have found to work well is to simply clobber sys.argv. This is just the list that Python keeps of the arguments passed to a script: https://docs.python.org/2/library/sys.html#sys.argv.
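As a rough sketch of the idea, the swap can be wrapped in a small context manager so the original list always comes back. The name patched_argv and the sample arguments are my own inventions for illustration, and the sketch assumes Python 3:

```python
import sys
from contextlib import contextmanager


@contextmanager
def patched_argv(new_argv):
    # Swap sys.argv for the duration of the with-block, then restore it,
    # even if the code inside raises.
    original = sys.argv
    sys.argv = list(new_argv)
    try:
        yield
    finally:
        sys.argv = original


before = list(sys.argv)
with patched_argv(["some_script", "-d", "example.com"]):
    inside = list(sys.argv)
after = list(sys.argv)

print(inside)
print(before == after)
```

Anything called inside the with-block sees the fake argument list, and our own arguments are back in place once the block ends.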
This method works best when also using a configuration file that users can edit with the paths to their local copies of the scripts you want to import (most likely git clone directories). By using the config file, users will not need to edit the actual script itself, but rather will edit the config file. This option precludes end users from mucking with the real code. Let’s take a look at a use case.
In one of my domain vetting automation projects, I wanted to use a tool called dnsrecon (https://github.com/darkoperator/dnsrecon). The problem is that dnsrecon was not written to be instantiated as an object, but rather was meant to be run as-is by itself. Let’s take a look at how I was able to import this module for use in my own script:
# This requires that the config file be named `config.py` and be located in the same directory
import config
import sys

# These directories need to be set in config.py!
# First, we grab the path from the config file, then we append the path to Python's system path
try:
    path_to_dnsrecon = config.path_to_dnsrecon
    sys.path.append(path_to_dnsrecon)
except AttributeError:
    # config.py exists but does not define path_to_dnsrecon
    print("Path to dnsrecon not set properly in config.py! QUITTING!")
    sys.exit(1)

# Now that the script we want to import is in the system path, we import the sucker
import dnsrecon
config.py file contents:
path_to_dnsrecon = "/Users/8bits0fbrain/git/dnsrecon"
Once we have done this, we must then perform the sys.argv clobbering before we call the script. We can do so like this:
# StringIO lives in different modules depending on the Python version
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

# Create an empty list to hold our results
results = []

# Store original arguments, as we need to clobber them to call dnsrecon.main()
# We store these so that we can restore them after we've called our imported script
arguments = sys.argv

# Set up arguments for dnsrecon
sys.argv = ['dnsrecon', '-d', 'malwerewolf.com']

# Trap stdout and run dnsrecon
# Also trap the sys.exit() that dnsrecon calls so that we can continue our program
old_stdout = sys.stdout
sys.stdout = dnsrecon_output = StringIO()
try:
    dnsrecon.main()
except SystemExit:
    # dnsrecon exits when it finishes; keep whatever it printed
    results.append(dnsrecon_output.getvalue())
except Exception:
    results.append("[!!] dnsrecon failed for this domain.\n")
else:
    # main() returned normally without calling sys.exit()
    results.append(dnsrecon_output.getvalue())
finally:
    # Finally, we restore stdout and our original script arguments
    sys.stdout = old_stdout
    sys.argv = arguments

# Now we can do whatever we want with the results, like print them
for line in results:
    print(line)
What’s going on here?
First, we back up our own command line arguments.
Second, it helps to know that dnsrecon parses its arguments with getopt() (@ line 1373 of dnsrecon.py). Thus, we then use
sys.argv = ['dnsrecon', '-d', 'malwerewolf.com']
to clobber (destroy/replace) our own
sys.argv (the command line arguments passed to our script) to match what we want to use for dnsrecon.
The sys.argv list always has the script name at index 0 (sys.argv[0]), so we’re just throwing the name in there for good measure. I use this more for tracking to be honest, since index 0 is often the full path to the script, depending on the OS and how the script was launched (which we could pass if wanted and/or the script required). Most argument parsers start at index 1, which is where the real arguments reside. At our index 1, we have -d, and at index 2 we have malwerewolf.com. This effectively makes dnsrecon think that we ran the following:
python dnsrecon.py -d malwerewolf.com
As such, when getopt() runs, it will think we passed the arguments -d malwerewolf.com on the command line.
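Here is a quick way to see what a getopt-based parser makes of the clobbered list. The option string "d:" is an assumption chosen for this illustration; dnsrecon's real option string covers many more flags. The sketch assumes Python 3:

```python
import getopt

fake_argv = ["dnsrecon", "-d", "malwerewolf.com"]

# Parsers conventionally skip index 0 (the script name) and read the rest.
options, leftovers = getopt.getopt(fake_argv[1:], "d:")

print(options)    # [('-d', 'malwerewolf.com')]
print(leftovers)  # []
```

The parser cannot tell whether that list came from a real shell command or from our own assignment, which is exactly why the trick works.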
Next, we back up and then clobber the normal Python stdout (sys.stdout) before we call dnsrecon.main(). We do not want to modify the dnsrecon code directly, yet when run, dnsrecon prints all of its results to stdout. By trapping stdout in the variable dnsrecon_output, which is a StringIO object, we can manipulate this data as we see fit (regex to find only certain content, run string replacements, etc.).
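If you are on Python 3, the same capture can be sketched with contextlib.redirect_stdout from the standard library, which is a different technique from assigning to sys.stdout by hand and restores stdout for you. The noisy_tool function and its sample output lines are hypothetical stand-ins for an imported script, not real dnsrecon output:

```python
import io
import re
from contextlib import redirect_stdout


def noisy_tool():
    # Hypothetical stand-in for an imported script that only prints results.
    print("[*] A example.com 192.0.2.10")
    print("[*] MX mail.example.com 192.0.2.25")


buffer = io.StringIO()
with redirect_stdout(buffer):
    noisy_tool()

captured = buffer.getvalue()

# Post-process the captured text, e.g. pull out anything shaped like an IPv4 address.
addresses = re.findall(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", captured)
print(addresses)
```

Once the text is in a plain string, any filtering or replacement happens on our side without touching the other script.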
Here’s the kicker! When dnsrecon completes, it calls sys.exit(), which works by raising a SystemExit exception. Therefore, we use except SystemExit: to catch the system exit. If we did not, our script would exit too, since the uncaught SystemExit would end our Python session. Again, the idea is to avoid modifying the other script in any way, shape, or form. By using this method, even scripts that try to kill the Python session can be imported (this is also why I chose dnsrecon as our example).
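All three steps can be rolled into one reusable helper. This is only a sketch under my own assumptions: run_script_main and fake_main are hypothetical names, the helper assumes Python 3, and fake_main merely imitates a script that prints and then exits:

```python
import io
import sys
from contextlib import redirect_stdout


def run_script_main(main_func, argv):
    """Call a no-argument, script-style main() with a fake sys.argv.

    Returns (captured_stdout, exit_code). exit_code is None when main_func
    returns without calling sys.exit().
    """
    saved_argv = sys.argv
    sys.argv = list(argv)
    buffer = io.StringIO()
    exit_code = None
    try:
        with redirect_stdout(buffer):
            main_func()
    except SystemExit as stop:
        # sys.exit() raises SystemExit; its .code holds the exit status.
        exit_code = stop.code
    finally:
        sys.argv = saved_argv
    return buffer.getvalue(), exit_code


def fake_main():
    # Hypothetical stand-in for something like dnsrecon.main().
    print("domain=" + sys.argv[2])
    sys.exit(0)


output, code = run_script_main(fake_main, ["fake", "-d", "example.com"])
print(output.strip(), code)
```

With dnsrecon imported as shown earlier, the equivalent call would presumably be run_script_main(dnsrecon.main, ['dnsrecon', '-d', 'malwerewolf.com']). Any exception other than SystemExit still propagates to the caller, so you can decide how to report a genuine failure.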
That’s it! I really hope this helps someone. I have used this method to import the following tools: dnsrecon, TekDefense-Automater, theHarvester, peepingtom, and more. If this has helped you, or if you have feedback, please leave a comment.