• SUMMARY

    Currently pursuing a Master's in Data Science at Indiana University Bloomington, with 3 years and 5 months of professional experience at Accenture. I am looking for internship opportunities for Summer 2017 to apply in industry the knowledge gained at the university and through previous experience.

    • Exposure to Big Data, Data Analytics, Data Cleansing, Feature Extraction, Algorithms and Programming.
    • Worked with Python and its libraries for Machine Learning, Web Scraping and Visualization.
    • Hands-on experience with Hadoop ecosystem technologies like MapReduce, Hive, Pig, Sqoop, Flume and Databases.
    • Significant coursework in Statistics, Machine Learning, Artificial Intelligence and Data Science.
    • Working knowledge of the Communication and Finance domains.

  • EDUCATION

    Indiana University Bloomington (School of Informatics and Computing)

    MS in Data Science (Computational and Analytical Track)

    January 2016 to December 2017

    GPA: 3.62

    University of Pune (Sinhgad Academy of Engineering)

    Bachelor of Engineering in Information Technology

    May 2008 to May 2012

    Grade: First Class with Distinction

  • EXPERIENCE

    QxBranch LLC

    Duration - January 2017 to Present

    Title - Data Scientist Intern

    Worked on a stochastic model for cricket fantasy league prediction

     

    Accenture India Pvt. Ltd.

    Duration - June 2012 to November 2015 (3 years and 6 months)

    Title - Software Engineering Analyst

    • Enterprise Service Delivery System (ESDS) (Client: American Bank) – August 2012 to April 2013
    • SONAR-AML (Anti-Money Laundering) (Client: American Bank) – April 2013 to July 2014
    • CID Rationalization and Inventory Alignment (Client: Telecom Company in UK) – August 2014 to November 2015

    Sands Technologies Pvt. Ltd.

    Duration - February 2012 to April 2012 (3 months)

    Title - Project Intern

    Worked on the Multi Tracking System

     

  • PROJECTS

    San Francisco Crime Analysis - Kaggle Competition

    Python (scikit-learn, NumPy), R (ggmap), Tableau

    Found meaningful patterns by plotting spatial and temporal data that give insight into crime in San Francisco, such as an increase in the crime rate in October, a decrease in December, and major crime hotspots in the Tenderloin and Southern regions. Took part in the Kaggle competition on crime classification and finished in the top 40% of 2335 teams. Used R (ggmap) and Tableau for visualization and Python (scikit-learn, NumPy) for classification. Worked with one other team member on the visualization.

    https://github.com/darekarsam/SF-Crime-Kaggle-Contest

    https://www.kaggle.com/gizimodo

    Feedback to Business Entities using Topic Modeling and Sentiment Analysis

    Python (NLTK), D3.js, Yelp Dataset

    Implemented topic modeling on the Yelp Dataset using Python (NLTK) to extract feedback for business entities and determine which of their services or products are good, bad or average. Worked in a team with two other members.

    https://github.com/naveenkumar2703/Sentiment-Analysis

    Movie Recommendation Engine (K Nearest Neighbor)

    Python, MovieLens Dataset

    Created a movie recommendation engine using the MovieLens dataset. Used the k-nearest neighbor approach and calculated the mean absolute difference to compare distance functions such as Euclidean, Manhattan and Lmax. Implemented the KNN approach from scratch in Python.

    https://github.com/darekarsam/Movie-Recommendation-Engine
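
    For illustration only, the three distance functions named in this project and a k-nearest-neighbor rating prediction can be sketched as below. This is not the code from the repository above; the function names, the data layout (a list of rating vectors paired with a known rating) and the plain averaging of neighbor ratings are all assumptions made for the sketch.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def lmax(a, b):
    """Lmax (L-infinity) distance: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(target, neighbors, k, distance):
    """Predict a rating as the mean rating of the k nearest neighbors.

    `neighbors` is a hypothetical list of (rating_vector, known_rating) pairs.
    """
    nearest = sorted(neighbors, key=lambda n: distance(target, n[0]))[:k]
    return sum(rating for _, rating in nearest) / len(nearest)


def mean_absolute_difference(predicted, actual):
    """Average absolute gap between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

    Passing a different distance function to knn_predict and comparing the resulting mean absolute difference on held-out ratings is one way to compare the three measures.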

    Statistical Analysis of Indian Cricket Team

    Python (pandas, scikit-learn, NumPy, seaborn), Data Scraping (Beautiful Soup, Requests)

    Compared high-profile Indian batsmen using kernel density estimates and analysed their performances in wins and losses. Used linear regression to predict the number of matches Virat Kohli will take to break the record of Sachin Tendulkar.

    http://darekarsam.strikingly.com/blog/sachin-scores-a-century-india-loses-the-match

    https://github.com/darekarsam/cricInfoDataAnalysis

    Decision Tree Implementation on UCI Datasets

    Python, UCI Datasets

    Implemented a greedy algorithm that learns a classification tree from a data set using Gini and entropy (information gain). Evaluated it using cross-validation and implemented overfitting prevention methods, all from scratch in Python without decision tree libraries.

    https://github.com/darekarsam/Implementing-Classification-Tree
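
    As a minimal sketch of the two impurity measures mentioned for this project (not the code from the repository above), Gini impurity, entropy and the information gain of a candidate split can be computed as follows. The function names are assumptions, and each label list is assumed to be non-empty.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum(
        (count / n) * math.log2(count / n) for count in Counter(labels).values()
    )


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted
```

    A greedy tree learner would evaluate such a score for every candidate split at a node and keep the split with the lowest impurity or highest gain.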

    Association Rules Implementation on UCI Datasets

    Python, UCI Datasets

    Implemented the Apriori algorithm by first determining frequent itemsets, using the Fk−1 × F1 and Fk−1 × Fk−1 methods for itemset generation, and then identifying association rules. Implemented confidence-based pruning and used lift as the measure of rule interestingness to enumerate all association rules for a given set of frequent itemsets, from scratch in Python without any libraries.

    https://github.com/darekarsam/Association-Rules-Implementation-using-UCI-Datasets
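
    For reference, the rule measures named in this project can be sketched as below. This is an illustrative example rather than the code from the repository above; the function names and the representation of transactions as a list of Python sets are assumptions.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's support.

    Values above 1 suggest the two sides co-occur more often than
    they would if they were independent.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


# Hypothetical toy transactions, used only to show the calls.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
rule_lift = lift({"bread"}, {"milk"}, baskets)
```

    Confidence-based pruning would discard any rule whose confidence falls below a chosen threshold before lift is used to rank the remaining rules.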

    Data Gathering, Analysis of Voter Data and Sentiment Analysis

    Python (Beautiful Soup, Requests, NLTK), Tableau

    Part of the data gathering team for a Member of the Legislative Assembly of Maharashtra during his campaign for the 2014 Lok Sabha elections in India. Scraped voter data from the website of the Election Commission of India using Beautiful Soup (a Python library for parsing) and Requests (an HTTP library for Python). Created a sentiment analysis engine using SVM and NLTK for data collected from his Twitter feed, then visualized the feedback in Tableau in several ways.

    Multi Tracking System (Sands Technologies)

    C# .NET, .NET Framework, MS SQL

    Implemented a web-based GPS tracking system in fulfilment of the Bachelor's degree requirement. Used socket programming in C# to receive data from the tracking device and store it on the server. Created SQL jobs to handle the large flow of data, built live tracking using KML for Google Maps, and implemented the algorithm for the geofencing module.

    Presented papers on the Multi Tracking System, a web-based GPS tracking system, at an international conference and in a journal.

    Enterprise Service Delivery System (Accenture)

    DB Tester

    ESDS was built to resolve all customer grievances in a single frame; the system was integrated from other systems handling nuclear tasks. Worked in the Orchestration Engine team on US shifts to work with the clients. Examined the backend flow of the application from upstream and downstream applications. Automated the manual validation of data in the system against the upstream and downstream applications. Handled daily calls with the onshore team and communicated the workflow to the offshore team.

    SONAR-AML(Anti-Money Laundering) (Accenture)

    Hive, MapReduce (Hadoop) and SQL Developer

    Built to identify potential money laundering customers and flag them for future transactions. Migrated historical customer data to HDFS to avoid delays in report generation, as the mainframes (VSAM, flat files, DB2 files) took longer to process the data with batch processing. Wrote UDFs for generating reports of potential money launderers, who were flagged on the basis of the transactions they performed. Received the Accenture Stellar Award for outstanding performance and completed Accenture’s Banking Generalist Certification Program.

    CID Rationalization and Inventory Alignment (Accenture)

    SQL Developer

    The BFG (Big Friendly Giant) DB, created to maintain customer inventories, had become unstructured because its design did not account for the exponential growth in data. Performed root cause analysis of the anomalous data and cleaned the data with Python scripts.

  • PUBLICATIONS

    Tracking System using GPS and GSM: Practical Approach

    International Journal of Scientific and Engineering Research Volume 3, Issue 5, May-2012, ISSN 2229-5518

    http://www.ijser.org/onlineResearchPaperViewer.aspx?Tracking-System-using-GPS-and-GSM-Practical-Approach.pdf

    Multi Tracking System using GPS and GSM

    Proceedings of Advances in Computer and Communication Technology (ACCT - 2012), organized by the Institution of Electronics and Telecommunication Engineers (IETE), Mumbai.

  • CONTACT

    sdarekar@iu.edu

    +1 812 671 3114

    LinkedIn Profile

    GitHub Profile
