Exponentially Logarithmic – Lifting the Lid on some Bloom Filter Derivations

In one of my previous posts on Bloom Filters, I stated the expressions for both the False Positive rate, i.e. $$p \approx \left( 1 - e^{\frac{-kn}{m}} \right)^{k}$$ and the optimal number of hash functions, i.e. $$k = \frac{m}{n} \ln 2$$. In this post I will detail the derivations of both expressions. Derivation of the False…
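
As a quick numerical sanity check of the two expressions above, here is a minimal Python sketch. It assumes the usual reading of the symbols (m is the number of bits in the filter, n the number of inserted items, k the number of hash functions); the function names and the example sizes are purely illustrative and are not taken from the post.

```python
import math


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Approximate false positive rate p ~ (1 - e^(-kn/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_num_hashes(m: int, n: int) -> float:
    """Real-valued optimum k = (m / n) * ln 2, before rounding."""
    return (m / n) * math.log(2)


# Hypothetical sizes chosen only for illustration: 10 bits per item.
m, n = 10_000, 1_000

k_opt = optimal_num_hashes(m, n)      # real-valued optimum
k = round(k_opt)                      # a filter needs an integer number of hashes
p = false_positive_rate(m, n, k)

print(f"optimal k (real): {k_opt:.3f}, rounded: {k}")
print(f"approximate false positive rate at k={k}: {p:.5f}")

# Neighbouring values of k should not do better than the rounded optimum.
for other in (k - 1, k + 1):
    print(f"k={other}: p={false_positive_rate(m, n, other):.5f}")
```

With 10 bits per item the optimum rounds to 7 hash functions, and the approximate false positive rate comes out a little under 1%; using one hash function more or fewer gives a slightly worse rate, which is what the expression for k predicts.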

Bridging the Graph

I’ve been lucky enough to have worked on many interesting and challenging projects throughout my career, especially in the space of detecting “bad” people. As one can imagine, much focus is placed on detecting persons of interest in the government sector. After all, you can’t ensure national security, for instance, without being able to detect…

In Bloom – Nirvana?

I recently discussed how Confidential Computing allows us to analyse and use data without actually seeing it. A key component of Confidential Computing is Privacy-Preserving Record Linkage (PPRL), and one of the most widely used techniques within PPRL is the Bloom Filter (BF), which is the focus of this post. The aim is to give an overview of…

Interview with Felipe Flores (Founder of Data Futurology)

Felipe Flores is the founder and podcast host of Data Futurology, a podcast targeted at helping Data Scientists become successful leaders. As the former Head of Data Science at ANZ bank, and a former consultant, he shares some incredible insights, and offers valuable advice, to Data Scientists at any stage of their career. You recently…

The Red Pill or the Blue Pill? Machine Learning vs Statistical Modelling

Over the years I’ve helped a number of organisations, large and small, public and private, build up their Data Science capabilities and derive value from data using various analytical techniques. However, one key concern I’ve had a number of times is the confusion that can exist between means and ends, i.e. solutions searching…

Interview with Felicity Splatt (Data Scientist/Manager at PwC)

Felicity Splatt is an experienced Data Scientist and currently manages a Data Science team at PwC. She has extensive experience across both the public and private domains, and has multiple degrees including a PhD in Quantum Computing! In this interview, she discusses the transitions from academia to the public sector, then from the public sector…

Interview with Ian Hansel (Director of Verge Labs)

Ian Hansel is the Director of the AI company Verge Labs, and is an experienced Data Scientist and Data Science Instructor. In this interview, he shares some fantastic insights and advice on getting started in Data Science and building a successful career. As a Director of Verge Labs, can you please tell us a little…