Musings of a FinTech – Actionable Insights from Social Media

Mining the data produced via social media effectively is very topical, with companies such as Thinknum emerging to offer metrics from social media and other sources that give new insights into company performance.

However, there is a wider question here of what data can actually be harnessed to provide genuine insight and tangible value to companies and individuals alike – the classic problem of extracting the signal from the noise.

Thinknum provides company metrics such as Twitter/Facebook followers, employees on LinkedIn and website traffic, which arguably could be useful indicators of a company’s health for investors. A recent FT article provides a good rundown of some of the current crop of investor offerings in this space.


In the area of housing, a recently published piece of research from Harvard, Facebook, NYU and the National Bureau of Economic Research provides one such insight using data from Facebook. Entitled “Social Networks and Housing Markets”, it looks at how social media influences an individual’s perception of the attractiveness of property investment.

The key takeaway from the paper is “Individuals whose friends experienced a 5 percentage points larger house price increase over the previous 24 months (i) are 3.1 percentage points more likely to transition from renting to owning over a two-year period, (ii) buy a 1.7 percent larger house, and (iii) pay 3.3 percent more for a given house. Similarly, when homeowners’ friends experience less positive house price changes, these homeowners are more likely to become renters, and more likely to sell their property at a lower price.”

It’s interesting to see how they combined the data sources – at its core, the model used Facebook user data along with market research data from Acxiom to build a rich demographic dataset.

One of the key uses of the Facebook friend data was the location of an individual’s friends – specifically whether they live within or outside of the Los Angeles county commuting zone (the surveyed homeowners all resided in LA county). This enabled the researchers to distinguish between local friend influences and biases, versus those further afield – the assumption being that house price movements experienced by friends outside the commuting zone would have been communicated via social media channels (sec 1.4 p10).
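To make the local versus further-afield split concrete, here is a minimal sketch of partitioning a person's friends by location and averaging the house price changes each group experienced. The county set, the friend records and the figures are all hypothetical, made up for illustration; the paper's actual variable construction is more involved.

```python
from statistics import mean

# Hypothetical stand-in for the counties inside the commuting zone.
# The real commuting zone definition is NOT taken from the paper.
ASSUMED_COMMUTING_ZONE = {"Los Angeles", "Orange"}


def split_friend_price_changes(friends, local_zone=ASSUMED_COMMUTING_ZONE):
    """Split (county, price_change_pct) records into local and distant groups."""
    local = [change for county, change in friends if county in local_zone]
    distant = [change for county, change in friends if county not in local_zone]
    return local, distant


def average_or_none(values):
    """Mean of the values, or None when a group has no friends in it."""
    return mean(values) if values else None


# Made-up friends: (county of residence, 24-month house price change in %).
friends = [
    ("Los Angeles", 4.0),
    ("Orange", 6.0),
    ("King", 12.0),
    ("Cook", 2.0),
    ("Travis", 10.0),
]

local, distant = split_friend_price_changes(friends)
local_avg = average_or_none(local)
distant_avg = average_or_none(distant)
print("local friends:", local_avg, "| distant friends:", distant_avg)
```

The distant-friend average is the interesting one here, since those price movements are unlikely to have been observed first hand.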

This was supplemented with the relevant housing data, and a four-question multiple-choice survey for testing the various hypotheses:


  1. How informed are you about house prices in your zip code?

[ ] Not at all informed [ ] Somewhat informed [ ] Well informed [ ] Very well informed


  2. How informed are you about house prices where your friends live?

[ ] Not at all informed [ ] Somewhat informed [ ] Well informed [ ] Very well informed


  3. How often do you talk to your friends about whether buying a house is a good investment?

[ ] Never [ ] Rarely [ ] Sometimes [ ] Often


  4. If someone had a large sum of money that they wanted to invest, would you say that relative to other possible financial investments, buying property in your zip code today is:

[ ] A very good investment [ ] A somewhat good investment [ ] Neither good nor bad as an investment [ ] A somewhat bad investment [ ] A very bad investment


To guard against framing effects in people’s responses, the ordering of the questions was changed in 35% of the surveys, which was an interesting point to note (although they didn’t find that participants were influenced by the ordering of questions in this instance).
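As a rough illustration of that kind of design, the sketch below keeps the default question order for most surveys and shuffles it for roughly 35% of them. The random shuffle, the seed and the shortened question labels are my assumptions; the post does not say how the alternative orderings were actually chosen.

```python
import random

# Shortened, hypothetical labels for the four survey questions.
QUESTIONS = [
    "Informed about prices in own zip code",
    "Informed about prices where friends live",
    "How often property investment is discussed with friends",
    "Is local property a good investment today",
]


def question_order(rng, reorder_share=0.35):
    """Return question indices, shuffled for about reorder_share of surveys."""
    order = list(range(len(QUESTIONS)))
    if rng.random() < reorder_share:
        # A shuffle can occasionally reproduce the default order by chance.
        rng.shuffle(order)
    return order


rng = random.Random(42)  # fixed seed so the sketch is repeatable
orders = [question_order(rng) for _ in range(1000)]
default_order = list(range(len(QUESTIONS)))
reordered = sum(1 for order in orders if order != default_order)
print(f"{reordered} of {len(orders)} surveys received a non-default order")
```

Recording which order each respondent saw is what lets you check afterwards whether ordering influenced the answers.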

This survey and demographic data was then combined with housing transaction data to build a number of regression models, which supported the conclusions of the paper.
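For readers less familiar with regression, this is a minimal one-variable least squares sketch. It is not the paper's specification, which includes many more controls, and the data points below are invented purely to show the mechanics.

```python
def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: friends' house price change (%) against a hypothetical outcome (%).
friend_price_change = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
hypothetical_outcome = [1.0, 2.1, 2.9, 4.0, 5.1, 5.9]

slope, intercept = ols_slope_intercept(friend_price_change, hypothetical_outcome)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

The slope is the quantity of interest: how much the outcome moves for each additional percentage point of price change experienced by friends.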


Given all of the talk about social media & mining this data, it’s a useful paper to be aware of: it illustrates not only a potential use case for harnessing the power of social media to generate insights, but also how complex a task it is to do so.

Bear in mind that the individuals in the survey were limited to those residing in LA county, and that the measured influence of social media on individuals is up to ~3%, which is pretty small in the grand scheme of things*. Applying a similar model to a related question, such as how social networks influence an individual’s mortgage preference, would be no small task!


*We are very excited overall that new data can not only lead to new ways of analysing risk but potentially be a strong leading indicator, allowing more time to rebalance portfolio risk. However, making a judgment call on new data presents higher modelling risk.


Read More

Architecture Evolution

Like most startups, Huffle’s website platform has undergone a number of changes during the past year. The path it has followed is pretty typical, with each platform evolution reflecting the increase in technical investment required to grow from one stage to the next.

I thought it may be useful to share this evolution given it’s such a common sequence of phases, and I hope it saves you some of the research you would otherwise have to do yourself.


Phase 1 – The Hosted Landing Page

Huffle’s initial site contained a single page used as a lead generation tool. There are many different platforms that can be used here; WordPress, Wix, Instapage and Unbounce are some of the popular options. Each of these platforms provides an online editor for designing and writing the relevant content you want to display. They also typically provide integrations with 3rd party services for capturing leads, such as MailChimp/Campaign Monitor for e-mail lists and Salesforce/Zoho for CRM.

Our very early site was hosted on Wix, but we preferred the landing page templates on Instapage, so we moved across in the middle of last year.

Once created, you simply point your site DNS record to the hosting provider and away you go with your landing page.


Phase 1 - Instapage


You’re completely at the mercy of your landing page provider – if they go down, there’s very little recourse you can take, but at least you have a web presence.


Phase 2 – Platform as a Service (PaaS)

The landing page was never considered more than a temporary web presence. We were able to import chunks of HTML/CSS/JavaScript into the page via the hosting platform, but we simply couldn’t customise the look and feel as much as we wanted to. Additionally, we wanted to throw a database into the mix to start capturing real customer data, so we needed to start building out a proper customer site using a web framework.
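As a flavour of what throwing a database into the mix means in practice, here is a minimal sketch of capturing leads in a database using only Python's standard library. The table layout and function names are hypothetical, and SQLite simply stands in for whatever database server you end up running; this is not Huffle's actual schema.

```python
import sqlite3


def open_lead_store(path=":memory:"):
    """Open (or create) a small lead store; in-memory by default for the demo."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS leads ("
        " id INTEGER PRIMARY KEY,"
        " email TEXT NOT NULL UNIQUE,"
        " created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def capture_lead(conn, email):
    """Store a lead; return True if it is new, False if already captured."""
    normalised = email.strip().lower()
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute("INSERT INTO leads (email) VALUES (?)", (normalised,))
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a duplicate e-mail address.
        return False
    return True


conn = open_lead_store()
first = capture_lead(conn, "Jane@example.com")
second = capture_lead(conn, "jane@example.com ")
lead_count = conn.execute("SELECT COUNT(*) FROM leads").fetchone()[0]
print(first, second, lead_count)
```

A web framework wraps this kind of logic in request handling, templating and migrations, which is where the choice of framework starts to matter.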

The most common choices in this space tend to be dynamically typed languages such as Ruby (Rails), Python (Django, Flask), PHP (CakePHP) or JavaScript (Node.js, React, AngularJS), as they tend to be good for getting something up and running quickly. You can go with a statically typed language (Go, Java, .NET, Scala, Haskell, …), but they tend not to be as fast for getting something live (unless you’re far more comfortable with statically typed languages).

The target deployment infrastructure is pretty straightforward, consisting of web and database servers.

However, getting a nice automated deployment process up and running takes time, plus the underlying servers need to be managed, which is where Platform as a Service (PaaS) solutions come in. We used Heroku, as it provided a ready-made platform for serving up applications in a number of different languages.

It provides a single command to deploy our latest code base out to their platform running on top of Amazon Web Services. Additional web servers (dynos in Heroku speak) can be freely added or removed to scale up/down your site as needs dictate, making it an ideal platform during the early stages of your startup.

Heroku also provides a marketplace for add-ons, making it really straightforward to add additional functionality (sending email, hosting over SSL, application monitoring, …) with a minimal amount of effort. You can also make use of tools such as loader.io to easily see how your site performs under moderate loads (hundreds of requests per second) to ensure your site can handle that initial burst of publicity.


Phase 2 - Heroku


As great as Heroku was for getting our web application up and running quickly, there were some limitations that were frustrating to work with:

  • You cannot jump onto a server to have a dig around – everything is done via the heroku logs command.
  • Heroku runs on top of AWS across a limited number of regions – none of which are in Australia.
  • You cannot run a Heroku application out of multiple regions without duplicating your entire platform, including the database server (which is expensive), in both regions. Plus you’ll need to find a way to synchronise your databases. This means that when there is a problem in the AWS region your Heroku instance is running in, or in Heroku itself, you have zero options for redundancy unless you duplicate your infrastructure.

Heroku does provide a status page, which is useful, but if site availability is crucial to you, these issues are too great to rely on it as your sole hosting platform. That is why we made the move to AWS, which gave us a greater degree of flexibility with our deployment/management options.


Phase 3 – Infrastructure as a Service (IaaS)

In the world of Infrastructure as a Service (IaaS), Amazon Web Services is king. There are a number of other IaaS platforms to choose from; however, given its relative maturity, its status as the platform of choice for so many startup success stories, and its Activate Program for startups, it was a no-brainer for us.

Amazon Web Services provides resilience across multiple geographic regions. Within each of these regions there are multiple data centres (availability zones) you can deploy your application across. This met our needs by giving us a hosting platform with availability across multiple physical sites, and therefore the resiliency we required for running our main production site.
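The idea of spreading each tier across availability zones can be sketched in a few lines. The role names, instance counts and placement function below are hypothetical, and the zone names are illustrative labels for the Sydney region; the sketch also ignores database replication, which is the hard part in practice.

```python
# Illustrative availability zone names for the Sydney region.
SYDNEY_ZONES = ["ap-southeast-2a", "ap-southeast-2b", "ap-southeast-2c"]

# Hypothetical tiers and how many instances of each we want to run.
ROLES = {"web": 2, "db": 2}


def place_by_role(roles, zones):
    """Assign each role's instances to zones round-robin."""
    placement = {}
    for role, count in roles.items():
        for index in range(count):
            placement[f"{role}-{index + 1}"] = zones[index % len(zones)]
    return placement


def survives_single_zone_loss(placement, zones):
    """True if every role keeps at least one instance when any one zone fails."""
    roles = {name.rsplit("-", 1)[0] for name in placement}
    for lost_zone in zones:
        remaining_roles = {
            name.rsplit("-", 1)[0]
            for name, zone in placement.items()
            if zone != lost_zone
        }
        if remaining_roles != roles:
            return False
    return True


spread = place_by_role(ROLES, SYDNEY_ZONES)
single = place_by_role(ROLES, ["ap-southeast-2a"])
print("spread across zones survives:", survives_single_zone_loss(spread, SYDNEY_ZONES))
print("single zone survives:", survives_single_zone_loss(single, SYDNEY_ZONES))
```

Running everything in one zone fails the check as soon as that zone is lost, which is the situation multi-zone deployment is meant to avoid.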

The up-front investment required to automate the provisioning and deployments of environments is high, requiring investment in:

  • A DevOps toolchain such as Ansible, Chef, Puppet or SaltStack for environment provisioning and ongoing management
  • Creating deployment/release tools, especially if you want to use immutable servers
  • Security – ensuring access points to your environment are minimised and communication between nodes is restricted to the bare essentials

The end result for us looks something like this, where we have full site redundancy across multiple data centres and are located within AWS’s Sydney region.


Phase 3 - AWS


Should our AWS region fail, a new copy of this environment could be brought up in a matter of minutes with our DevOps provisioning tools. For now, this setup mostly meets our needs, and it provides us with a great degree of flexibility going forwards.

AWS does provide Platform as a Service capabilities with its Elastic Beanstalk offering; however, we wanted the flexibility to manage our own servers and support non-standard use cases such as hosting multiple sites over SSL on a single set of infrastructure, which does not play so well with Elastic Beanstalk.

They also provide OpsWorks for managing cloud infrastructure; however, it does tie you to Chef, which was less appealing for us compared with some of the other options out there.


Footnote – DNS Failover

One of the options that we looked at early on was DNS failover to provide resiliency between different hosting providers, should one of them fail. The issue with this approach is that most providers require you to work with IP addresses, which is not feasible if you’re using a provider that only gives you a URL to point to.

Amazon’s Route 53 DNS record management service provides a failover mechanism with CNAME records, which we found was good for our use case.
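For illustration, a failover pair of CNAME records in Route 53 can be described as a change batch like the one built below. This is a sketch under assumptions rather than our actual configuration: the hostnames, the health check ID and the helper function name are all placeholders. The dictionary keys follow the Route 53 ChangeResourceRecordSets request format, and the code only builds the structure without calling AWS.

```python
def failover_cname_changes(record_name, primary_target, secondary_target,
                           health_check_id, ttl=60):
    """Build a Route 53 change batch for a PRIMARY/SECONDARY CNAME failover pair."""

    def record(role, target, extra):
        record_set = {
            "Name": record_name,
            "Type": "CNAME",
            "SetIdentifier": f"{role.lower()}-site",  # must be unique per record
            "Failover": role,                         # "PRIMARY" or "SECONDARY"
            "TTL": ttl,                               # short TTL so failover is seen quickly
            "ResourceRecords": [{"Value": target}],
        }
        record_set.update(extra)
        return {"Action": "UPSERT", "ResourceRecordSet": record_set}

    return {
        "Comment": "Failover between two hosting providers (illustrative)",
        "Changes": [
            # The primary record needs a health check so Route 53 knows when to fail over.
            record("PRIMARY", primary_target, {"HealthCheckId": health_check_id}),
            record("SECONDARY", secondary_target, {}),
        ],
    }


# Placeholder values only; none of these are real hosts or IDs.
batch = failover_cname_changes(
    record_name="www.example.com.",
    primary_target="primary-host.example.net",
    secondary_target="backup-host.example.org",
    health_check_id="00000000-0000-0000-0000-000000000000",
)
print([change["ResourceRecordSet"]["Failover"] for change in batch["Changes"]])
```

With boto3 installed and credentials configured, a batch like this could be passed as the ChangeBatch argument of the Route 53 client's change_resource_record_sets call, along with your hosted zone ID. Note that a CNAME cannot be used at the zone apex, so this pattern suits names such as www rather than the bare domain.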

Read More

Evolving Architecture

As a startup co-founder and CTO, I have a number of responsibilities for my business. To quote Eric Ries (http://www.startuplessonslearned.com/2008/09/what-does-startup-cto-actually-do.html), one of these is “platform selection and architectural design”.

I have invested very heavily in this area since starting my journey with Huffle, and this blog is really to capture my thoughts and decision process made along this journey.

As you’d expect my role is currently very varied – I come from a financial technology background, and right now my role is covering all sorts of things, including:

  • Strategy
  • Dev-ops
  • Model development/implementation
  • Platform architecture
  • Server side and front end development (full-stack appears to be the buzzword doing the rounds these days…)
  • UX/design

My posts will likely dip into all of the above; some will be ad-hoc notes, others more in-depth pieces discussing strategic directions we’ve taken along the route.

Read More