Isolation, no matter where you are, affects nearly every aspect of life, whether that means living in a smaller, more desolate community or lacking the transportation access needed for healthcare, education and crops.
Bridges to Prosperity (B2P) is a U.S.-based non-profit organization that does more than just partner with local governments to connect those less fortunate by way of a bridge. They work tirelessly around the clock to help alleviate poverty and isolation between villages and communities, assessing sites and then building bridges. Lives are lost, crops are destroyed and families are displaced on a daily basis when rivers flood, sometimes for days or weeks on end. With these bridges, Bridges to Prosperity helps keep communities safe and connected and people employed, and helps ensure that children get the education they deserve.
Before construction on any bridge begins, B2P deploys a team of assessors alongside local government officials to determine whether the suggested site is acceptable. After establishing what the stakeholders refer to as a buy-in, they work alongside their team to provide the proper training and resources to get the job done.
Needs are assessed based solely on field surveys conducted on site by assessors and structural engineers. The amount of work this method requires has B2P struggling to keep up with the demand for these bridges. The challenge has required them to re-evaluate the following:
- A more user-friendly way of determining where a bridge can be built
- The most effective way of maximizing the impact of the bridges being built
In our meeting with the stakeholder, she asked us to take an updated dataset and predict whether the sites that the structural engineers had flagged for rejection were rightly rejected, or were in fact false negatives: viable sites for construction.
With that said, a significant amount of the data turned out not to be needed, as it had nothing to do with the task the stakeholder set before us.
After sufficient discussions and planning, we decided the following deliverables would best suit the stakeholder:
- A database free of duplicate columns that would otherwise affect the outcome of our predictive models.
- A cleaner and more precise API that allows the web team to interact with the Data Science team’s database through POST and GET requests.
- Most importantly, a final EDA (exploratory data analysis) of the dataset that gives future teams and stakeholders clearer lines of communication, along with a leaner and clearer solution to the problem.
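To make the first deliverable a little more concrete, here is a minimal sketch of one way to strip duplicate columns with pandas. This is not our project code: the helper name drop_duplicate_columns and the tiny sample table (site_name, district, district_copy) are made up for illustration, and it assumes "duplicate" means either a repeated column name or a column whose values exactly match an earlier column.

```python
import pandas as pd


def drop_duplicate_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without repeated or identical-content columns."""
    # First drop columns whose *name* repeats, keeping the first occurrence.
    df = df.loc[:, ~df.columns.duplicated()]
    # Then drop columns whose *values* match an earlier column exactly.
    # Transposing turns columns into rows so duplicated() can compare them.
    keep = ~df.T.duplicated().to_numpy()
    return df.loc[:, keep].copy()


# Hypothetical sample data, not the real B2P dataset.
sites = pd.DataFrame(
    {
        "site_name": ["A", "B", "C"],
        "district": ["X", "Y", "Z"],
        "district_copy": ["X", "Y", "Z"],  # same values as "district"
    }
)

cleaned = drop_duplicate_columns(sites)
print(list(cleaned.columns))
```

Transposing is fine for a table of roughly 1,500 rows, but it can be slow on very wide or very long tables, so treat this as a starting point rather than a recommendation.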
THE PLANNING PHASE
From the beginning, Team B from Labs28 was more than just a cohesive unit. We took the time to listen to one another before we did anything else. Our DS team had 4 members, and our Web/UI team had 4 members until one was hired full-time.
By the end of the first week, we had already decided on the architecture we were going to use for the project over the three weeks that followed. We diagrammed it using the website whimsical.com.
From there, we tracked all of our steps on our Trello board. It’s a lifesaver of a tool that helps the teams stay engaged. For the most part, we paired up and tackled the tasks as teams.
DATA ANALYSIS & CHALLENGES FACED
- The stakeholder provided us with a dataset containing 1472 locations for current and possible bridge sites, their GPS locations, and the current site status, type and attributes. The information included all sites that had been accepted as well as those that were flagged for rejection.
- Before we could actually begin exploring the data, we had to wait for clarification on what we were supposed to do with the ‘Comment’ column. Originally, we had planned on a predictive model, but we decided that a semi-supervised model using Label Spreading was going to be our best option to maximize our results.
- Other than not being able to hit the ground running with a fresh dataset that nobody had ever used before, the biggest challenge was figuring out what the previous team had been trying to accomplish. They ultimately ended up using multiple datasets from outside resources. While our initial thought was to do the same, our team stepped up and took a fresh look at the problem using only the dataset provided to us by the Stakeholder.
- Towards the end of our EDA (exploratory data analysis), however, we ran into an issue: our model predicted that all 1472 sites were ‘good_sites’. It was performing at a level where we were not able to reject the null hypothesis, so we decided to switch it up to a majority class prediction. Originally we had a definition of “X_final = X_train + X_Val + X_test” when it should have been “X_final = bridges_df[features]”.
- Lastly, I would say that working remotely while collaborating was difficult considering time zones.
- We shipped a new database in which all existing datasets are merged together, with the files cleaned up and stored.
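For readers curious what the Label Spreading approach and the X_final fix look like in code, here is a minimal sketch using scikit-learn. It is not our project code: the data is synthetic, and the column names span_m, bank_height_diff_m and good_site are made up for illustration. It assumes unlabeled sites are marked with -1, which is the convention LabelSpreading uses. One plausible reason the original definition misbehaved: if X_train, X_Val and X_test were pandas DataFrames, the + operator adds them element by element instead of stacking them.

```python
import numpy as np
import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)
n = 60

# Synthetic stand-in for the bridge site table (hypothetical columns).
bridges_df = pd.DataFrame(
    {
        "span_m": np.concatenate(
            [rng.normal(40, 5, n // 2), rng.normal(90, 5, n // 2)]
        ),
        "bank_height_diff_m": np.concatenate(
            [rng.normal(1.0, 0.3, n // 2), rng.normal(4.0, 0.3, n // 2)]
        ),
    }
)

# Only a handful of sites carry a label; -1 marks "unlabeled".
labels = np.full(n, -1)
labels[:5] = 1                      # known good sites
labels[n // 2 : n // 2 + 5] = 0     # known rejected sites
bridges_df["good_site"] = labels

features = ["span_m", "bank_height_diff_m"]

# Select the feature columns from the full table rather than
# "adding" the train/validation/test splits together.
X_final = bridges_df[features]
y = bridges_df["good_site"].to_numpy()

# Scale features so the neighbor graph is not dominated by one column.
X_scaled = StandardScaler().fit_transform(X_final)

# Spread the few known labels through a k-nearest-neighbor graph.
model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X_scaled, y)
bridges_df["predicted_good_site"] = model.transduction_

# Majority-class baseline fitted on the labeled rows only, for comparison.
labeled = y != -1
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_scaled[labeled], y[labeled])
baseline_pred = baseline.predict(X_scaled)

print(bridges_df["predicted_good_site"].value_counts())
```

If the spread labels all land in a single class, as ours did, comparing against the majority-class baseline is a quick way to see that the model is not adding anything yet.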
FUTURE OF THE PROJECT
- Engagement: For this project to move forward and succeed, there needs to be more contact with the Stakeholder. From the first week on, we as teams were left in the dark, wondering what direction we needed to be headed in. As other teams have in the past, we relied solely on assumptions, hoping they wouldn’t backfire. Weekly meetings between the individual teams and the Stakeholder, plus a feedback section where the Stakeholder can leave comments, would be ideal for future teams so that they can respond correctly and in a timely manner.
- Datasets: The information we received from the stakeholder was nothing like what previous teams had received. There was agreement across our team that the stakeholders need to verify the dataset before handing it off to us. They need to be clearer and more concise about what they both need and want.
MORAL OF THE STORY
- We had a timeline of only 4 weeks for this project. With the first week devoted to planning, and the rest shared with completing the Sprint Challenges for Labs and scheduling and prepping for our endorsement interviews, our team may have only gotten one solid week on the project itself. That meant we took a risk spending a solid week just planning, and planning for the unknown at that. Would we deliver what the client truly wanted?
- Most of the data could have used a lot more time to clean, scrape, model, train and test. However, a lot of the work was unfamiliar to us and would have needed additional time that we didn’t have.
- Lastly, this project opened my eyes to a world that I didn’t truly know existed. To think that employment for women rises by 24% and that each household’s income rises by 10–20% with each bridge built is astonishing to me. Creating something to help someone else’s dream come true is a very special feeling.