Saturday, February 17, 2018

Tools and Skills for Information Security Practitioners: Python Part 1

This series, Tools and Skills for Information Security Practitioners, will touch on what I have discovered over the years that can take a career in security to the next level.

There are a ton of general skills that someone in security should have, and there are a ton of other resources out there to help someone get into the security field. I want to help take someone to the next level. Going to the next level could mean a couple of things: first, it is my opinion that the best security people have found a niche or two within security that they really enjoy; second, it means continuing to hone the basic skills until you can do them in your sleep.

About me: see my previous blog post about how I got to where I am in general.

The reason I am starting this series is that I see way too many people wanting to get into security to be a pen tester or, to a lesser extent, to do forensics. While I will never begrudge a person for wanting to do something specific, many of these folks seem to think that pen testing is what the cool kids are doing and that is what they see in the media as being what security is...

However, if you just concentrate on what everyone wants to do, getting into or progressing in the security field is going to be difficult as the pen testing discipline becomes saturated.

Don't get me wrong, there are a bunch of great pen testers, and pen testing is a core tenet of a secure organization that needs people to perform these tests.

But there are probably dozens of other disciplines in security that are not as visible or as "cool" as pen testing, but are equally, if not more, important in my opinion. A pen tester can tell an organization where its weaknesses are, which may be the way a bad guy got in, but the pen tester can't tell you that there is a bad guy already in the network taking advantage of those weaknesses and stealing your data... And there are already a ton of pen testing resources out there.

I want to concentrate on areas that I feel are not discussed as much, or that lack tutorials, for the new or junior security practitioner.

In this first post, I want to concentrate on Python from a security practitioner's standpoint. This will absolutely not be an all-encompassing Python tutorial or step-by-step learning Python post. This will concentrate on skills, modules and techniques that I have found helpful in the security monitoring, response, threat intel and threat hunting disciplines.

About a year ago, I made the move from Python 2 to Python 3. Although there is no true technical reason for using 3 instead of 2, version 3 tries to improve on the basic programmatic features, such as making print a function, so version 3 uses print("Hello World") vs. version 2's print "Hello World". Python 2 has some modules (such as __future__) that can help in the transition by letting version 2 scripts use some version 3 behavior. I try to put checks for 2 vs. 3 in the scripts that I make public, but in some cases, if my scripts do not run - it's probably this...
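
A 2 vs. 3 check like the one mentioned above might look roughly like this. It is only a minimal sketch; the helper name describe_interpreter is made up for illustration.

```python
import sys


def describe_interpreter(version_info=sys.version_info):
    """Return a short label for the Python major version."""
    major = version_info[0]
    if major >= 3:
        return "Python 3 or newer"
    return "Python 2"


# print() with parentheses and a single argument works in both versions
print(describe_interpreter())
```

A script could call a check like this at startup and exit with a clear message instead of failing later on a syntax difference.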

I am not a Programmer or Developer

I have found a ton of uses of Python in what I do. Let me caveat all of this: I do not consider myself a programmer or developer. I create scripts to solve common tasks or to provide integrations with other systems or data. I AM NOT A DEVELOPER, so if you bitch about my code formatting, the efficiency of my code or anything that would make a developer a developer - I will ignore it. That said, I may at least read constructive comments on making something more efficient, since I do hate slow scripts... You have been warned. :)

Same with the modules I talk about below - there are hundreds of Python modules in the community, which I absolutely love - and as I am trying to solve a problem, if I find a module that works and seems to be pretty robust, I will likely stick with it. Prime example - requests vs. urllib2 (urllib.request in Python 3) - urllib2 may be a bit more powerful than requests and you may see me use it from time to time, but for general web or API calls, I like requests a bit more, so that is what I use...

Technical Setup

  • Python 3.x: to put it bluntly - make the move to Python 3, it has been out for years, is stable and is the future. Especially if you are just learning - learn on v3. I try to make my scripts backwards compatible, but I do not test my scripts in v2, so be aware if you are looking at my code or scripts
  • IPython: I do a lot of my initial coding of a new script in IPython - I find it a heck of a lot more powerful and better for initial development and testing. The basic python shell works perfectly well, but I do not find it easy to use when doing hard core scripting before putting it into a .py file.
  • Mac OS X vs. Linux vs. Windows: I have become a Mac guy (but don't call me an Apple fan boy - I still have an Android phone...), so most of my scripting is in OS X, but everything will work in Linux or Windows - there may just be some different setup steps - there are many resources that talk about running Python on Linux and Windows...
  • Atom Editor: I have fallen in love with the atom.io editor that is put out by Github. There are a bunch of great packages that I use (base64, language-email, file-icons, etc). There are a ton of other good editors (sublime, CodeRunner) or IDEs (PyCharm), but if you use Github to store and release code, Atom has built-in support for pushing commits to Github - I absolutely love that and stopped even looking at other editors.

Reasons for the skill

  • Security Monitoring: Creating scripts to interact with applications and services' APIs
  • Security Operations: Automate common tasks
  • Threat Intelligence: Develop scripts and interpretations for commonly used web or other services
  • Threat Hunting: consolidating and analyzing large amounts of data
  • Data Analysis: finding trends, creating visualizations, and interpreting indicators

Most used skills

  • Moderate level of understanding of general Python scripting
    • Loading and using modules
    • Populating and using lists and dictionaries
  • Required Modules
    • requests
    • json
    • xmltodict
    • re
    • datetime
    • hashlib
    • os
    • sys
  • Optional Modules that can help a lot
    • configparser
    • pymongo
    • argparse
    • BeautifulSoup
    • pandas
    • numpy
    • wget
    • socket
  • Cool modules that may or may not be used
    • flask
    • arrow
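
As a quick illustration of the "lists and dictionaries" skill together with one module from the required list (hashlib), here is a small sketch. The connection data is made up for the example.

```python
import hashlib

# Made-up sample data: (source IP, destination port) pairs
connections = [
    ("192.168.0.4", 80),
    ("192.168.0.7", 443),
    ("192.168.0.4", 443),
]

# Dictionary keyed by source IP; each value is a list of ports seen
ports_by_source = {}
for source_ip, port in connections:
    ports_by_source.setdefault(source_ip, []).append(port)

# hashlib works on bytes, so encode strings before hashing
digest = hashlib.sha256("192.168.0.4".encode("utf-8")).hexdigest()

print(ports_by_source)
print(digest)
```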

Basic usage of some of the modules

I'll touch on some of the usages of the modules and concepts noted above and give a couple of examples.

requests
The requests module is an alternative to urllib2 and for my purposes requests does just fine and is what I have become accustomed to.

Documentation: http://docs.python-requests.org/en/master/

Basic Usage: Here, I will make a request to the site Unshorten.me, which takes shortened links and displays what the full URL is. Why don't I just make the request to the shortened link instead of pushing through http://unshorten.me? My IP address will not show in the shortening service's interface; thus, not alerting the bad guy of an IP that is potentially investigating the link...
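
    import requests

    # Shortened link under investigation
    s_url = 'http://bit.ly/2GmLHzg'
    # Unshorten.me JSON endpoint; the short link is appended to it
    url = 'https://unshorten.me/json/'

    r = requests.get(url + s_url)
    print(r.json()['resolved_url'])
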
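
For anything that runs unattended, it may be worth wrapping the same call with a timeout and an HTTP status check. The following is a rough sketch only: the function names and the 10 second timeout are my own assumptions, and the only response key used is the 'resolved_url' key from the example above.

```python
import requests

UNSHORTEN_API = 'https://unshorten.me/json/'


def build_lookup_url(short_url):
    """Append the shortened link to the Unshorten.me endpoint."""
    return UNSHORTEN_API + short_url


def unshorten(short_url, timeout=10):
    """Return the resolved URL, or None if the response lacks that key."""
    r = requests.get(build_lookup_url(short_url), timeout=timeout)
    r.raise_for_status()  # raise an exception on HTTP 4xx/5xx responses
    return r.json().get('resolved_url')
```

Calling unshorten('http://bit.ly/2GmLHzg') would make the same request as the snippet above, but it gives up after the timeout instead of hanging, and it fails loudly on an error response.
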
json
Many APIs or automated systems use JSON to return data (as you can see in the Requests section above). Some modules, such as requests, do have built-in recognition of JSON, so the additional json module is not necessary. However, if you want to do a bit more with the returned data or save it out to a file, the json module becomes a robust way to deal with JSON data.

Documentation: https://docs.python.org/3/library/json.html

Basic Usage: Below is a snippet of code used to post a message to a Hipchat server.

    import requests
    import json

    host = "mysite.hipchat.com"
    token = "ThisIsMyHipchatAPIToken"
    room = "Threat Intel Notifications"
    message = "Test from Python"

    url = "https://{0}/v2/room/{1}/notification".format(host, room)
    headers = {'Content-type': 'application/json'}
    headers['Authorization'] = "Bearer " + token
    payload = {
        'message': message,
        'notify': False,
        'message_format': 'text',
        'color': 'green'
    }

    # The API expects a JSON string in the body, so serialize the dictionary
    r = requests.post(url, data=json.dumps(payload), headers=headers)

In this example, the Hipchat API documentation indicates that the post must be sent in JSON format. Although a JSON object looks, for all intents and purposes, like a Python dictionary, you cannot send a Python dictionary across to Hipchat as-is; it must be sent as a string. So, to send over a string that contains the JSON object, you must use json.dumps(payload) to serialize the dictionary before requests sends it across.
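
To make the dictionary vs. string distinction concrete, and to show the "save it out to a file" case, here is a small sketch using only the standard library. The payload values and the payload.json file name are just for illustration.

```python
import json
import os
import tempfile

payload = {'message': 'Test from Python', 'notify': False, 'color': 'green'}

# json.dumps turns the dictionary into a JSON string
body = json.dumps(payload)
print(type(body).__name__)

# json.loads turns a JSON string back into a dictionary
restored = json.loads(body)

# json.dump / json.load do the same thing against a file object
path = os.path.join(tempfile.mkdtemp(), 'payload.json')
with open(path, 'w') as handle:
    json.dump(payload, handle, indent=2)
with open(path) as handle:
    from_disk = json.load(handle)
```

Note that Python's False becomes JSON's lowercase false in the string, which is one reason a plain str(payload) is not a substitute for json.dumps.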

xmltodict
I cannot stand XML. While XML may be perfectly fine for some applications or scripts, for some reason I cannot fully grasp handling XML in Python (if you have to handle XML in a purer manner, the ElementTree module [https://docs.python.org/2/library/xml.etree.elementtree.html] is what has worked for me in the past).

So, instead, I will convert XML to a dictionary using xmltodict.

Again, I am more about getting a job done than being the 'leetest scripter...

Documentation:  https://github.com/martinblech/xmltodict

Basic Usage: Fortunately, most APIs are using JSON now, so I do not have a specific security related example right now, so here is something that I used xmltodict for with baseball data.

Until recently, very detailed MLB game data was available from http://gdx.mlb.com/components/game/mlb/ - however, something happened recently and now there is a 'The specified key does not exist' error.

If you want MLB data, Retrosheet is currently the next best source, just not as detailed. Anyway, while some of the MLB data is available in JSON, most of the more detailed data is only in XML (for example, the individual pitch data in the inning_all.xml file), so I needed to start with XML data.

    import xmltodict
    import requests

    game_url = "http://gdx.mlb.com/components/game/mlb/year_2017/month_09/day_11/gid_2017_09_11_colmlb_arimlb_1/inning/inning_all.xml"

    r = requests.get(game_url)
    data = r.text.encode('utf-8')
    # Convert the XML document into nested dictionaries and lists
    doc = xmltodict.parse(data)

The doc variable will be a dictionary of the values from the inning_all.xml file. Note: the xmltodict module does something unusual with the dictionary - any key that comes from an XML attribute (a plain value rather than a nested element) is prefixed with an '@'. For example, within the doc variable, the batter's player id would be accessed with:

    doc['game']['inning'][0]['top']['atbat'][0]['@batter']
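
For comparison, here is roughly how the same kind of attribute lookup might look with ElementTree, the standard-library module mentioned earlier. This is a sketch under assumptions: the XML below is made up and only imitates the game/inning/top/atbat shape of the path above, and the attribute values are invented.

```python
import xml.etree.ElementTree as ET

# Made-up XML that only imitates the shape described above
sample = """
<game>
  <inning num="1">
    <top>
      <atbat batter="123456" pitcher="654321"/>
      <atbat batter="234567" pitcher="654321"/>
    </top>
  </inning>
</game>
""".strip()

root = ET.fromstring(sample)  # root is the <game> element

# Attributes are read with .get(); there is no '@' prefix here
first_atbat = root.find('inning/top/atbat')
print(first_atbat.get('batter'))

# findall returns every matching element, in document order
batters = [atbat.get('batter') for atbat in root.findall('inning/top/atbat')]
print(batters)
```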

re
The re module is another highly used module - it can perform regular expression searches on strings.

Documentation: https://docs.python.org/3/library/re.html

Basic usage: Here we will search a list of generic traffic log entries for a specific IP address.

    import re

    test_log = ["192.168.0.4:44352 -> 192.168.1.6:80",
                "192.168.0.7:34323 -> 192.168.1.7:443"]

    for log in test_log:
        # Escape the dots so they match literally instead of any character
        if re.search(r'192\.168\.0\.7', log):
            print("Found 192.168.0.7")
        else:
            print("192.168.0.7 not in log")
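
Beyond a yes/no search, re can also pull the addresses and ports out of each entry. The following is a sketch only; the pattern is deliberately loose and is my own assumption about the log format shown above.

```python
import re

test_log = ["192.168.0.4:44352 -> 192.168.1.6:80",
            "192.168.0.7:34323 -> 192.168.1.7:443"]

# Loose IPv4:port pattern; it does not check that each octet is 0-255
pattern = re.compile(r'(\d{1,3}(?:\.\d{1,3}){3}):(\d+)')

for log in test_log:
    # findall returns one (ip, port) tuple per match, left to right
    endpoints = pattern.findall(log)
    print(endpoints)

# re.escape builds a literal pattern from an arbitrary string; the
# lookahead keeps 192.168.0.7 from also matching 192.168.0.70
target = re.compile(re.escape('192.168.0.7') + r'(?!\d)')
hits = [log for log in test_log if target.search(log)]
```

Using re.escape is handy when the IP address comes from a variable or a file, since you do not have to escape the dots by hand.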

Conclusion

That is about all the time I have for today - I will continue this series in the near future.

Please let me know what you think of the post and feel free to contact me at paul[@]ir4n6[.]com.
