The Data Science Toolkit is a collection of open source tools wrapped in an easy-to-use REST/JSON interface and available for download as a virtual machine image.
Some of the tools included are Boilerpipe, GeoIQ/Schuyler Erle's Geocoder, and Geodict.
The Data Science Toolkit is assembled by Pete Warden in an attempt to get these important data tools into the hands of more developers. The toolkit provides:
- Independence - Never worry about the provider going offline, or charging once you're hooked.
- Security - Run on your intranet, so customer information stays within the firewall.
- Scalability - No API limits. Run a cluster of as many instances as you need.
You can play with a sandbox he has set up, review the documentation, or grab the VM and launch an Amazon EC2 instance using public AMI ami-9e7d8ff7.
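
Because the tools sit behind a REST/JSON interface, switching from the sandbox to your own instance should mostly mean changing the host you send requests to. Here is a minimal sketch of that idea in Python. The host name, the `street2coordinates` endpoint name, and the `/<endpoint>/<query>` URL layout are all assumptions made for illustration, so check the documentation for the real paths before relying on any of them.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_request_url(base_url: str, endpoint: str, query: str) -> str:
    """Build a GET URL shaped like <base_url>/<endpoint>/<url-encoded query>.

    The URL layout is an assumption for illustration, not a documented contract.
    """
    return f"{base_url.rstrip('/')}/{endpoint}/{quote(query, safe='')}"


def fetch_json(url: str, timeout: float = 10.0) -> object:
    """Send a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical values: replace with your own instance and a documented endpoint.
BASE_URL = "http://dstk.example.internal"  # assumed intranet host
ENDPOINT = "street2coordinates"            # assumed endpoint name

url = build_request_url(BASE_URL, ENDPOINT, "1600 Pennsylvania Ave, Washington, DC")
print(url)
# To actually send the request against a running instance:
# result = fetch_json(url)
```

Keeping the base URL in a single setting is what makes the independence and security points above practical: the same client code can target the sandbox, an EC2 instance, or a VM inside your firewall.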