Importing Python Scripts: Clobbering sys.argv

by 8bits0fbr@in
Categories: Coding, Python

Often, we find the need to import a non-class-based Python module into our own modules. Sure, many proper Python packages exist that we can use to do our dirty work (https://docs.python.org/2/tutorial/modules.html#packages), but what about when we run into a random script that simply uses its own main() and various functions to get the job done?

If we try to import a Python script, we run into an issue if the sucker uses any arguments. Why? The main reason is that in our imported script, we will most likely have good ol’:

if __name__ == "__main__":

When a script is run directly, Python sets __name__ to "__main__", so the code under that check executes. When the same file is imported, __name__ is the module's name instead and that block is skipped. No biggie, right? We’ll just import the module and then call the main() function, taking care to pass the arguments we want into the function. Right? WRONG. The problem we run into is that main() does not take arguments. Rather, the main() function most likely uses argparse, optparse, getopt, or another parser to parse command-line arguments passed to the script.
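To make that concrete, here is a minimal sketch of what such a script's main() tends to look like, and what happens when we try to hand it a value directly. It is written in Python 3 syntax (the snippets in this post are Python 2), and the names `toolscript` and `--domain` are made up for illustration, not taken from any real tool.

```python
import argparse


def main():
    """Stand-in for a script's main(): it accepts no parameters."""
    parser = argparse.ArgumentParser(prog="toolscript")
    parser.add_argument("-d", "--domain", required=True)
    # With no list passed in, parse_args() reads sys.argv[1:]
    args = parser.parse_args()
    print("Checking " + args.domain)


# Handing the value straight to main() fails before any parsing happens
try:
    main("example.com")
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```

The call dies with a TypeError because main() has no parameters to receive the value; the only input it ever looks at is sys.argv.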

So, how do we overcome this issue?

One way that I have found to work well is to simply clobber sys.argv. This is just the list that Python keeps of the arguments passed to a script: https://docs.python.org/2/library/sys.html#sys.argv.

This method works best when also using a configuration file that users can edit with the paths to their local copies of the scripts you want to import (most likely git clone directories). By using the config file, users will not need to edit the actual script itself, but rather will edit the config file. This option precludes end users from mucking with the real code. Let’s take a look at a use case.

In one of my domain vetting automation projects, I wanted to use a tool called dnsrecon (https://github.com/darkoperator/dnsrecon). The problem is that dnsrecon was not written to be instantiated as an object, but rather was meant to be run as-is by itself. Let’s take a look at how I was able to import this module for use in my own script:

Example code:

# This requires that the config file be named `config.py` and be located in the same directory
import config
import sys
from StringIO import StringIO  # Python 2 module; used further down to capture stdout

# These directories need to be set in config.py!
# First, we grab the path from the config file, then we try to append the path to Python's system path
try:
    path_to_dnsrecon = config.path_to_dnsrecon
    sys.path.append(path_to_dnsrecon)
except AttributeError:
    # config.py exists but does not define path_to_dnsrecon
    print "Path to dnsrecon not set properly in config.py!  QUITTING!"
    sys.exit(1)

# Now that the script we want to import is in the system path, we import the sucker
import dnsrecon

Example config.py file contents:

path_to_dnsrecon = "/Users/8bits0fbrain/git/dnsrecon"
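A slightly more defensive take on the path setup, as a sketch only: the helper name `add_script_dir` is my own invention, and it simply checks that the configured directory exists before touching sys.path. It uses Python 3 syntax, unlike the Python 2 snippets in this post.

```python
import os
import sys


def add_script_dir(path):
    """Append `path` to sys.path if it is an existing directory.

    Returns True on success, False if the path is empty, missing,
    or not a directory.
    """
    if not path or not os.path.isdir(path):
        return False
    if path not in sys.path:
        sys.path.append(path)
    return True


# Demo with a directory that certainly exists (the current one)
# and one that almost certainly does not
ok = add_script_dir(os.getcwd())
missing = add_script_dir(os.path.join(os.getcwd(), "no-such-dir-for-this-demo"))
print(ok, missing)
```

In a real script you would call something like `add_script_dir(config.path_to_dnsrecon)` and quit with a helpful message when it returns False.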

Once we have done this, we must then perform the sys.argv clobbering before we call the script. We can do so as such:

    # Create an empty list to hold our results
    results = []

    # Store original arguments, as we need to clobber them to call dnsrecon.main()
    # We store these so that we can return them after we've called our imported script
    arguments = sys.argv

    # Setup arguments for dnsrecon
    sys.argv = ['dnsrecon', '-d', 'malwerewolf.com']

    # Trap stdout and run dnsrecon (StringIO comes from `from StringIO import StringIO`)
    # Also trap the sys.exit that dnsrecon implements so that we can continue our program
    old_stdout = sys.stdout
    sys.stdout = dnsrecon_output = StringIO()
    try:
        dnsrecon.main()
        results.append(dnsrecon_output.getvalue())
    except SystemExit:
        # dnsrecon exits when it is done; keep whatever it printed
        results.append(dnsrecon_output.getvalue())
    except Exception:
        results.append("[!!] dnsrecon failed for this domain.\n")
    finally:
        # Put the real stdout back no matter how dnsrecon finished
        sys.stdout = old_stdout

    # Finally, we restore our original script arguments
    sys.argv = arguments

    # Now we can do whatever we want with the results, like print them
    for line in results:
        print line

What’s going on here?

First, we back up our own command-line arguments.

Second, it helps to know that dnsrecon.main() uses getopt() (@ line 1373 of dnsrecon.py). Thus, we then use

sys.argv = ['dnsrecon', '-d', 'malwerewolf.com']

to clobber (destroy/replace) our own sys.argv (the command line arguments passed to our script) to match what we want to use for dnsrecon.

The sys.argv list always has the script name at index 0 (sys.argv[0]), so we’re just throwing the name in there for good measure. I use this more for tracking to be honest, since index 0 is normally the script name as it was invoked, which may or may not be a full path depending on the operating system (we could pass the full path if we wanted and/or the script required it). Most argument parsers start at index 1, which is where the real arguments reside. At our index 1, we have -d and then at index 2 we have malwerewolf.com. This effectively makes dnsrecon think that we ran the following:

python dnsrecon.py -d malwerewolf.com

As such, when getopt() runs, it will think we passed the arguments -d malwerewolf.com.
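Here is a small sketch of what getopt sees once the list has been swapped. The option string 'd:' is a made-up one for illustration (a single short option, -d, that takes a value); dnsrecon's real option string is not reproduced here.

```python
import getopt

# What our clobbered sys.argv looks like
fake_argv = ['dnsrecon', '-d', 'malwerewolf.com']

# Parsers skip index 0 (the script name) and read from index 1 onward.
# 'd:' is a hypothetical option string: -d followed by a required value.
options, leftovers = getopt.getopt(fake_argv[1:], 'd:')

print(options)    # [('-d', 'malwerewolf.com')]
print(leftovers)  # []
```

The parser has no way of telling that these values came from our code rather than from a shell.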

Next, we back up and then clobber the normal Python stdout (sys.stdout) before we call dnsrecon.main(). We do this because we do not want to modify the dnsrecon code directly. However, when run, dnsrecon prints all results to stdout. By trapping the stdout to the variable dnsrecon_output, which is a StringIO object, we can manipulate this data as we see fit (regex to find only certain content, run string replacements, etc.).
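The stdout trap can be tried on its own with a stand-in function. This is a sketch in Python 3 syntax (so StringIO comes from `io` rather than the Python 2 `StringIO` module), and `noisy_tool` plus its output lines are invented for the demo.

```python
import sys
from io import StringIO


def noisy_tool():
    """Stand-in for a script that only reports results by printing them."""
    print("[*] A record: 192.0.2.10")
    print("[*] MX record: mail.example.com")


old_stdout = sys.stdout
sys.stdout = captured = StringIO()
try:
    noisy_tool()
finally:
    # Always put the real stdout back, even if the tool blows up
    sys.stdout = old_stdout

# The printed text is now an ordinary string we can filter
mx_lines = [line for line in captured.getvalue().splitlines() if "MX" in line]
print(mx_lines)
```

Restoring sys.stdout in a `finally` block means a crash inside the tool cannot leave our own program printing into the buffer.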

Here’s the kicker! When dnsrecon completes, it calls sys.exit, which works by raising a SystemExit exception. Therefore, we use except SystemExit: to catch the system exit. If we did not use this, our script too would exit, since an uncaught SystemExit ends the Python session. Again, the idea is to avoid modifying the other script in any way, shape, or form. By using this method, even scripts that try to kill the Python session can be imported (this is also why I chose dnsrecon as our example).
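Putting the three tricks together (argv swap, stdout trap, SystemExit catch), one possible reusable wrapper looks like the sketch below. The names `run_script_main` and `fake_main` are hypothetical, the stand-in tool is not dnsrecon, and the code uses Python 3 syntax.

```python
import sys
from io import StringIO


def run_script_main(main_func, argv):
    """Call a script-style main() with a temporary sys.argv and captured stdout.

    Returns (output_text, exit_code). exit_code is None if main() returned
    normally or exited without giving a code.
    """
    saved_argv, saved_stdout = sys.argv, sys.stdout
    sys.argv = list(argv)
    sys.stdout = buffer = StringIO()
    exit_code = None
    try:
        main_func()
    except SystemExit as exc:
        # sys.exit() raises SystemExit; catching it keeps our program alive
        exit_code = exc.code
    finally:
        # Restore both globals however main_func finished
        sys.argv, sys.stdout = saved_argv, saved_stdout
    return buffer.getvalue(), exit_code


def fake_main():
    """Stand-in for a tool that prints its result and then exits."""
    print("looked up " + sys.argv[2])
    sys.exit(0)


output, code = run_script_main(fake_main, ["faketool", "-d", "example.com"])
print(output.strip(), code)
```

With a real tool, you would pass its main function, for example `run_script_main(dnsrecon.main, ['dnsrecon', '-d', 'malwerewolf.com'])`, assuming the import from earlier succeeded.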

That’s it! I really hope this helps someone. I have used this method to import the following tools: dnsrecon, TekDefense-Automater, theHarvester, peepingtom, and more. If this has helped you, or if you have feedback, please leave a comment.

Thanks gang!

Fer Shizzle.
