The DARPA (Defense Advanced Research Projects Agency) Open Catalog is worth visiting at least once to see the range of open source software available for manipulating big data.
From the website:
"The DARPA Open Catalog organizes publicly releasable material from DARPA programs, beginning with the XDATA program in the Information Innovation Office (I2O). XDATA is developing an open source software library for big data. DARPA has an open source strategy through XDATA and other I2O programs to help increase the impact of government investments."
Lots of interesting software is available.