Launching Ask GitHub!

Ask GitHub searches the latest GitHub public timeline to answer questions and provide insights. It is hosted on Heroku, and the technology stack includes Python and Node.js, the NoSQL databases MongoDB and Neo4j, the Flask web framework, the Bootstrap front-end framework, Typeahead integration, and scalable vector icons from Font Awesome. Visit Ask GitHub.
Background
I developed an initial prototype for the Third Annual GitHub Data Challenge. That prototype got me thinking about the awesome potential and amazing data points the GitHub public timeline provides. Iterating further on it, I developed Ask GitHub to search the past 24 hours of the GitHub public timeline to answer questions and provide interesting insights. After working on this project for the past few months (mostly during late evenings and weekends), I am pleased to officially announce the launch of Ask GitHub! The code driving Ask GitHub is available on GitHub - let’s collaborate!
Data Gathering
The public GitHub timeline from GitHub Archive is parsed hourly using a Node.js streaming parser.
Currently, the event types PushEvent, CreateEvent and WatchEvent are captured.
PushEvent contains information about commits and authors. CreateEvent contains new repositories.
WatchEvent is recorded when a user stars a repository, which signals popular repositories.
Output log of data gathered hourly
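The filtering step described above can be sketched in Python. This is only an illustration, not the Node.js streaming parser that Ask GitHub uses. It assumes the hourly archive is a gzipped file with one JSON event per line and a top-level "type" field; the function names are hypothetical.

```python
import gzip
import io
import json

# Event types the article says are captured.
CAPTURED_TYPES = {"PushEvent", "CreateEvent", "WatchEvent"}


def iter_captured_events(lines, wanted=CAPTURED_TYPES):
    """Yield parsed events whose "type" is in `wanted`, one line at a time."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a malformed line rather than abort the whole hour
        if isinstance(event, dict) and event.get("type") in wanted:
            yield event


def read_hourly_archive(raw_bytes):
    """Decompress one gzipped hourly archive (assumed format) and filter it."""
    with gzip.open(io.BytesIO(raw_bytes), mode="rt", encoding="utf-8") as fh:
        return list(iter_captured_events(fh))
```

Because `iter_captured_events` is a generator, it can also be fed a file handle directly so that a large archive is never held in memory at once, which is the same motivation behind a streaming parser.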
Data Storage
Data is stored in MongoDB hosted on Compose. A text index is set on the field full_name, and search results are sorted by the dynamically generated document score.
Documents older than 24 hours are deleted.
Search aggregation pipeline using text index score
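A rough sketch of what such a pipeline and cleanup filter might look like, written as plain Python dictionaries so it runs without a database. The function names, the `limit` parameter, and the `created_at` timestamp field are assumptions for illustration; only the full_name text index and the 24-hour window come from the description above. The actual pipeline in Ask GitHub may differ.

```python
from datetime import datetime, timedelta, timezone


def build_search_pipeline(query, limit=10):
    """Aggregation stages for a text search ranked by MongoDB's text score."""
    return [
        # $text must appear in the first $match stage and needs a text index,
        # e.g. one created on full_name.
        {"$match": {"$text": {"$search": query}}},
        # Order by the relevance score MongoDB computes per document.
        {"$sort": {"score": {"$meta": "textScore"}}},
        # Expose the score alongside the repository name.
        {"$project": {"full_name": 1, "score": {"$meta": "textScore"}}},
        {"$limit": limit},
    ]


def build_expiry_filter(now=None, max_age_hours=24, field="created_at"):
    """Filter matching documents older than the retention window.

    `field` is a hypothetical timestamp field; the article does not name it.
    """
    now = now or datetime.now(timezone.utc)
    return {field: {"$lt": now - timedelta(hours=max_age_hours)}}
```

With PyMongo, the first result would typically be passed to `collection.aggregate(...)` and the second to `collection.delete_many(...)`, after creating the index with `collection.create_index([("full_name", "text")])`.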
Web Framework
Flask is a lightweight web application framework written in Python. It serves Ask GitHub using responsive design templates built with Bootstrap. Bootstrap offers a highly customizable grid system that works across different devices, numerous built-in classes for styling, and an extensive list of components.
Bootstrap badges & accordion

User Experience
Twitter’s Typeahead is integrated into the search box to provide a list of pre-defined questions, and matching repositories are dynamically generated to guide users. Scalable vector icons from Font Awesome are used in addition to text to share interesting data points. The FuzzyWuzzy string comparison library is used to provide suggestions and "did you mean" queries.
Example 1: Questions?

Example 2: Top new repositories

Example 3: User commit frequency

Example 4: Suggestion
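The "did you mean" idea can be illustrated with a small sketch. To keep it dependency-free, it uses `difflib` from the Python standard library in place of FuzzyWuzzy, so the scores will not match what Ask GitHub computes. The function name, the candidate list, and the 0.6 cutoff are assumptions.

```python
import difflib


def did_you_mean(query, known_names, cutoff=0.6):
    """Return the known name closest to `query`, or None if nothing is close.

    Matching is case-insensitive; the original spelling is returned.
    """
    lowered = {name.lower(): name for name in known_names}
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


# A misspelled query is mapped to the nearest known repository name.
suggestion = did_you_mean("boostrap", ["bootstrap", "flask", "mongodb"])
```

Raising the cutoff makes suggestions more conservative, which may be preferable when the list of candidate repositories is large.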

Roadmap
Automate the recommendation engine using Neo4j and continue to keep iterating.
Tags
- github
- analytics
- bootstrap
- typeahead
- mongodb
- Neo4j
Harish Chakravarthy is an intrapreneur leveraging technology to make a positive difference.
Interests include API integration, user experience, data visualization and analytics.