Financial Forecasting and Analysis for Low-Wage Workers
Abstract.
Despite the plethora of financial services and products on the market nowadays, there is a lack of such services and products designed especially for the low-wage population. Approximately 30% of the U.S. working population engage in low-wage work, and many of them lead a paycheck-to-paycheck lifestyle. Financial planning advice needs to explicitly address their financial instability.
We propose a system of data mining techniques on small-scale transactions data to improve automatic and personalized financial planning advice to low-wage workers. We propose robust methods for accurate prediction of bank account balances and automatic extraction of recurring transactions and unexpected large expenses. We formulate a hybrid method consisting of historical data averaging and a regularized regression framework for prediction. To uncover recurring transactions, we use a heuristic approach that capitalizes on transaction descriptions. Our methods achieve higher performance compared to conventional approaches and state-of-the-art predictive methods on real financial transactions data.
In collaboration with Neighborhood Trust Financial Partners, the proposed methods will upgrade the functionalities in WageGoal, Neighborhood Trust Financial Partners’ web-based application that provides budgeting and cash flow management services to a user base comprising mostly low-income individuals. The proposed methods will therefore have a direct impact on the individuals who are or will be connected to the product.
1. Background and Motivation
Low-wage workers make up a large portion of the working population, and need the most help in financial planning. The U.S. Bureau of Labor Statistics defines low-wage work in three ways (Fusaro and Shaefer, 2016):

Wage ($9.25/hr) lifting a family of two above poverty line,

Wage ($10.75/hr) lifting a family of three above poverty line,

Wage ($13.50/hr) lifting a family of three to 125% of the poverty line,
where the corresponding U.S. wages in 2013 are in parentheses. Approximately 15,000,000 U.S. workers earned up to $9.25/hr and 36,000,000 earned up to $13.50/hr in 2013. These are equivalent to 12.63% and 29.54% of the U.S. working population respectively. To put the wages into perspective, $13.50/hr amounts to a monthly income of just over $2000, while the median rent for one-bedroom apartments across 50 major U.S. cities is already $1200 (Insider, 2017b).
Many low-income households live paycheck-to-paycheck, at risk of financial instability in the event of medical, job or other unforeseen emergencies when they are forced to take up loans and in the process incur additional charges. A report by the Pew Charitable Trusts (Pew Charitable Trusts, 2016) finds that the largest U.S. banks charged $11.6 billion in overdraft and insufficient fund fees in 2015, of which a significant portion is attributed to the poor. To make matters worse, the average debit charge triggering such fees is $24, while the typical overdraft fee is $35.
The poor are affected by obstacles including the high cost of personalized financial services, the lack of suitable banking services (Barr, 2004), and the hassle of seeking financial services (Mullainathan and Shafir, 2010). We hope to alleviate these obstacles through data-driven solutions. In this paper, we propose a system of data mining techniques that can be used to improve automatic and personalized financial planning advice to low-wage workers. We work with anonymized checking, savings, and credit card account transactions data. User identification information such as age, gender, and size of household is not used. We also work in the small data scenario where low-income individuals may not have a long banking history due to the aforementioned obstacles.
This work is motivated by use cases for Neighborhood Trust Financial Partners’ WageGoal app, which caters primarily to low-income individuals (Neighborhood Trust Financial Partners, 2017). Figure 1 contains screenshots illustrating some current functionalities such as giving a cash flow snapshot of the user’s income, bills and expenses, and helping with cash flow management. We aim to improve the accuracy of the forecast and information extraction functionalities. As illustrated in Figure 2, our main contributions are to propose methods for:

Short- and long-term prediction of bank account balances with improved accuracy;

Automatic extraction of recurring transactions and unexpected large expenses.
The functionalities inform users about their possible future spending behavior so that they have enough time to adjust and to save up for emergencies. The main difficulties of these tasks originate from the multifaceted nature of transactions data. A user’s spending depends on individual needs and historical spending, but can also exhibit patterns similar to other users. Moreover, salary, bills and other recurring transactions provide characteristic features of a user’s spending behavior, but these cyclic patterns can be noisy and inconsistent at times. Our methods address these difficulties by both effectively mining a user’s recurring transactions and borrowing strength from the spending patterns of other users. We tested on two real financial transactions datasets: a smaller dataset from actual WageGoal users, and a larger publicly available dataset from the PKDD’99 Discovery Challenge. Our methods achieve higher performance compared to conventional approaches and state-of-the-art prediction methods on both datasets.
We differentiate the proposed functionalities from those already offered by personal finance apps on the market (Insider, 2017a). The apps mainly track cash flow and provide simple budgeting tools, such as calculating an average daily “spendable” amount based on a user’s income and saving goal. For low-wage workers whose bank balance can hover dangerously around zero, more accurate estimates and finer-grained financial analysis are necessary.
2. Transactions Data: Overview
The WageGoal app collects users’ transactions through authorized accounts using a third-party service, Plaid (Plaid, 2018). Each transaction is associated with an account ID, date, description or merchant name, amount and category. Table 1 shows a sample snippet of the transactions captured. There are a total of 11 categories, namely Bank Fees, Cash Advance, Community, Food and Drink, Healthcare, Interest, Payment, Recreation, Service, Shops, and Travel. Uncategorized transactions are labeled NA. In the WageGoal Dataset, 28% of the transactions are uncategorized. The category labels are provided through Plaid and we do not attempt to address the problem of missing labels in this paper. Daily account balances are retrospectively calculated from the current balance.

Account ID  Date  Description  Amount ($)  Category
1  6/22/2016  IKEA  20  Shops
2  6/22/2016  Target  10  NA
1  6/24/2016  Starbucks  15  Food & Drink
3  6/24/2016  Interest  0.01  Interest
3  6/24/2016  Direct Deposit  1000  Transfer
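The retrospective balance calculation mentioned above can be illustrated with a minimal sketch. It assumes that amounts are signed so that spending is positive and income is negative, and that the known balance is the closing balance on the current date. The function name and the data layout are hypothetical and are not taken from WageGoal.

```python
from datetime import date, timedelta

def daily_balances(current_balance, current_date, transactions, n_days):
    """Walk backwards from the current balance to recover closing balances.

    transactions: list of (date, amount) pairs. A positive amount is spending
    and a negative amount is income (an assumed sign convention).
    Returns a dict mapping each of the last n_days dates to its closing balance.
    """
    spent_on = {}
    for day, amount in transactions:
        spent_on[day] = spent_on.get(day, 0.0) + amount

    balances = {}
    balance = current_balance
    for offset in range(n_days):
        day = current_date - timedelta(days=offset)
        balances[day] = balance
        # Undo this day's net spending to obtain the previous day's closing balance.
        balance += spent_on.get(day, 0.0)
    return balances

history = daily_balances(
    100.0,
    date(2016, 6, 24),
    [(date(2016, 6, 24), 15.0), (date(2016, 6, 22), 20.0), (date(2016, 6, 22), 10.0)],
    n_days=4,
)
```

In this example the account closes at 100 on 6/24/2016 and, after undoing the listed transactions, is inferred to have closed at 145 on 6/21/2016.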
2.1. Initial Cluster Analysis
The WageGoal dataset, further described in Section 5.1.1, consists of 19 users with approximately one year of transactions. Using Dynamic Time Warping (DTW) distance, we cluster the overall balance sequences for the last available month by hierarchical agglomerative clustering. Balance sequences are set to zero-mean since the mean does not affect the spending pattern. We use window = 2 for DTW to allow for slight misalignments in the time series.
Setting the number of clusters to five, two clusters consist of one user each. For the three other clusters, we plot their average category-wise spending in Figure 3, which shows a clear distinction between the less and more well-off users. The least well-off (blue) group spent less and also incurred more bank fees compared to the moderate (green) and the most well-off (yellow) groups. In the four categories not shown, the three groups spent similarly.
Temporal patterns in balance sequences can help to distinguish the users’ financial status. Hence, we devise our prediction method to borrow strength from the data of similar users.
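As an illustration of the distance used in this cluster analysis, the following sketch computes a windowed DTW distance between two zero-mean sequences. The absolute-difference local cost and the interpretation of window = 2 as a band of half-width two are assumptions; the clustering step itself is omitted, since the pairwise distances would simply be handed to any agglomerative clustering routine.

```python
def zero_mean(seq):
    """Subtract the mean so that only the shape of the sequence remains."""
    mean = sum(seq) / len(seq)
    return [x - mean for x in seq]

def dtw_distance(a, b, window=2):
    """DTW distance restricted to a band of half-width `window` around the diagonal."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))  # the band must be wide enough to reach cell (n, m)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# A spike shifted by one day is matched at no cost when a small window is allowed.
shifted = dtw_distance([0, 0, 5, 0, 0], [0, 0, 0, 5, 0], window=2)
rigid = dtw_distance([0, 0, 5, 0, 0], [0, 0, 0, 5, 0], window=0)
```

With window = 0 the comparison degenerates to a day-by-day difference, which is why a small positive window is helpful for slightly misaligned balance series.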
3. Related Work
Most works on transactions data make use of RFM (Recency, Frequency and Monetary value) to define and analyze customer value (Birant, 2011; Khajvand et al., 2011; Chang and Tsai, 2011). However, these summary statistics mask too many details for our purpose. Instead, we devise data mining and time series techniques to extract more information from the raw data and also make use of similarities amongst users’ spending behaviors to effectively improve prediction.
3.1. Prediction
Two-part models are popular in modeling household finances, medical expenditure and other data with non-negative values (Brown et al., 2015; Min and Agresti, 2002; Mullahy, 1998; Neelon et al., 2011). These models can be formulated to cluster the subjects such that each cluster receives different parameter values. In our application, we need to consider both positive and negative spending (i.e., income). Moreover, two-part models often use covariates as part of the binary and continuous component representations. Covariates include time-related variables and also class-membership variables, such as gender and employee vs. dependency status in (Neelon et al., 2011). Such covariates need extra manual effort to construct or are simply not available when we work with anonymized data.
A traditional time series model is the ARMA (autoregressive moving average) which regresses the current variable value on past values and error terms (Shumway and Stoffer, 2006). Seasonal ARMA models have been used on periodic data such as traffic flow (Williams and A. Hoel, 2003). Since transactions data contain multiple seasonal components and these components may not have fixed periodicities as we explain in Section 3.2, a plain seasonal ARMA method is not suitable. Zhang (Zhang, 2003) uses a neural network to model residuals from the ARMA model, while the authors in (Gonzaga Baca Ruiz et al., 2016) directly use neural networks for energy consumption prediction. However, these approaches require large quantities of data to provide good predictions.
In other data-driven time series analysis literature, Taylor and Letham (Taylor and Letham, 2017) implement a regression-based method that is fast and includes holiday effects, but it models only weekly seasonality and requires handcrafted variables to indicate holiday effects. Approaches reported in (Alvarez et al., 2011; L. Scott and R. Varian, 2014; Zhang et al., 2013) extract similar sequences in historical data to incorporate more diverse patterns. Authors of (Alvarez et al., 2011) and (Zhang et al., 2013) use KNN regression and provide predictions by taking unweighted or weighted averages of the samples immediately after the matched sequences, but these may not be robust enough against anomalous data. Such “anomalies” or spikes in expenditure are not uncommon in financial transactions. To induce sparsity on regression coefficients, (L. Scott and R. Varian, 2014) uses a spike-and-slab prior. Markov Chain Monte Carlo methods are used to estimate the parameters in the full Bayesian model, but they take significant computation time to iteratively forecast multiple future days, one day at a time.
While the aforementioned methods are suitable for general “well-behaved” time series, we incorporate design features suitable for transactions data, while working under the framework that no further data annotations should be required and that data may be limited at the early stages of user enrollment. We recognize that different bank account types and prediction horizons require different treatment, and hence propose a hybrid approach to address the different scenarios. One aspect of our approach works on the level of individual transactions and relies on extracted recurring transactions, while the other works from the holistic point of view of the overall balance series, where we extract similar sequences and use a computationally efficient regularized regression scheme to penalize anomalous sequences. Another key innovation is that we align the extracted sequences using landmark transaction events before regressing, which is observed to increase prediction accuracy.
3.2. Finding recurring transactions
For time series applications, it is common practice to identify periodicities by transforming the series into the frequency domain. A Lomb-Scargle periodogram approach to treat missing values and unevenly-spaced time points in finding periodicity in gene expression patterns is proposed in (Glynn et al., 2006). There are also methods such as (Elfeky et al., 2005) which address the problem directly in the time domain. Although some of these methods are capable of detecting multiple periodicities, they require the period of each cyclic component to be consistent across time. In transactions data, there can be significant jitter in the periodicity of recurring transactions such as paychecks, due to differences in the number of days each month and the presence of holidays. Moreover, just using numerical values is insufficient to identify recurring transactions since there is substantial noise, and this is especially so for recurring transactions with small dollar amounts. Manually constructing and maintaining a complete biller’s list for each user’s recurring transactions is ideal but tedious. Hence in this paper, we propose a procedure that automatically identifies possible recurring transactions. The procedure takes into account the inconsistent nature of transactions and better distinguishes between transactions through their textual descriptions.
4. Methods and Technical Solutions
We propose methods that use transactions data as described in Section 2. The main challenges of working with transactions data versus conventional numerical time series are the presence of:

Text description for each transaction;

Multiple noisy and inconsistent periodic patterns;

Spikes in spending.
4.1. Prediction of account balances
We predict account balances up to 31 days ahead. This forecast horizon encompasses two semimonthly paydays to give users sufficient time and information to plan their finances. We propose a historical data averaging method HistAvg which is more suited for short-term prediction and accounts with minimal transactions, and a regularized least squares method SubseqLS which is more suited for long-term prediction of accounts with distinct cyclic patterns. To effectively address the nuances in modeling different types of accounts, we further propose a hybrid method HistAvgSubseqLS where the first days are predicted by HistAvg and the next days are predicted by SubseqLS.
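One simple reading of the hybrid scheme is a splice of two forecasts over the same horizon. The sketch below assumes that both component methods have already produced a full forecast and that a single integer, here called `switch_day`, marks where the hand-over happens. The function name and this parameter name are hypothetical.

```python
def hybrid_forecast(histavg_forecast, subseqls_forecast, switch_day):
    """Use the HistAvg forecast for the first `switch_day` days and SubseqLS afterwards."""
    if len(histavg_forecast) != len(subseqls_forecast):
        raise ValueError("both forecasts must cover the same horizon")
    if not 0 <= switch_day <= len(histavg_forecast):
        raise ValueError("switch_day must lie within the forecast horizon")
    return list(histavg_forecast[:switch_day]) + list(subseqls_forecast[switch_day:])

spliced = hybrid_forecast([1, 2, 3, 4], [10, 20, 30, 40], switch_day=1)
```

Setting `switch_day` to the full horizon reduces the hybrid to HistAvg alone, and setting it to zero reduces it to SubseqLS alone.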
4.1.1. HistAvg
HistAvg predicts daily spending and is adapted from the current implementation in WageGoal. The original version uses a biller’s list to extract bills, while we use recurring transactions found by our proposed procedure in Section 4.2. Predicting spending using past three months’ transactions is performed by:

Removing recurring transactions;

Removing top 10% of transactions;

Calculating daily basic spending as the average amount spent daily according to the remaining transactions;

Estimating future spending as the sum of daily basic spending and any recurring transaction to fall on that day.
The top 10% of transactions are excluded from the calculation of the daily basic spending since these are typically one-time or rare purchases. Finally, the account balance is estimated by summing the previous day’s balance and the estimated spending on that day.
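The HistAvg steps above can be sketched as follows. This is a minimal illustration rather than the WageGoal implementation. It assumes that recurring transactions have already been removed from `amounts`, that spending is positive and income is negative (so a day's spending is subtracted from the balance), and that the number of dropped transactions is rounded up. All function and parameter names are hypothetical.

```python
import math

def daily_basic_spending(amounts, n_days, top_fraction=0.10):
    """Average daily spend after dropping the largest `top_fraction` of transactions.

    amounts: non-recurring spending amounts observed over `n_days` days.
    """
    ordered = sorted(amounts)
    n_drop = math.ceil(top_fraction * len(ordered))  # rounding up is an assumption
    kept = ordered[: len(ordered) - n_drop]
    return sum(kept) / n_days

def histavg_forecast(last_balance, basic_spending, recurring_by_day, horizon):
    """Roll the balance forward one day at a time.

    recurring_by_day maps a day offset (1 = tomorrow) to the signed recurring
    amount expected on that day (positive = spending, negative = income).
    """
    balances = []
    balance = last_balance
    for day in range(1, horizon + 1):
        spending = basic_spending + recurring_by_day.get(day, 0.0)
        balance -= spending
        balances.append(balance)
    return balances

basic = daily_basic_spending([10.0] * 9 + [1000.0], n_days=90)
forecast = histavg_forecast(100.0, basic, {2: 50.0}, horizon=3)
```

In the example, the single 1000-dollar purchase is excluded from the basic spending rate, and the 50-dollar recurring bill expected on day 2 appears as a step in the forecast.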
4.1.2. SubseqLS
We assume that a similar balance history implies a similar future save for some “anomalies”. This motivates us to make use of all available historical data across users for prediction.
SubseqLS predicts one day ahead by first setting the target account’s balance sequence from the immediate past as the length query vector . It then selects balance sequences that are similar to in its first values from historical balances of all users. Finally, it determines the best weights for the sequences to match . We combine the M sequences linearly for simplicity of the model, but more flexible combinations are also possible in principle. Essentially, for account of user , we consider
for some transformation and estimate the coefficients for a good prediction performance. Weights determined in traditional KNN regression methods tend to overfit to the query and are not robust to “anomalies”, so we regularize the estimation to avoid this.
Let be the current date, and be the number of days to predict. We set so that is sufficiently long to capture most recurring transactions. To recap notations, is the query for some account of some user , and is the day ahead balance to be predicted. Each selected sequence is in length. We use DTW (Dynamic Time Warping) distance with window = 2 to measure sequence similarity to allow slight misalignments, and iteratively find each using the fast search in (Rakthanmanon et al., 2012). We then locally expand or contract the sequences to adjust for temporal variations through function , which aligns each to a template of payday events, since paydays mark the start of cyclic spending patterns. Payday estimation is described in Section 4.2. Denoting the aligned sequence as , the first subsequence with length is used to match , and the second with length is used for prediction. Note that we let the subscripts of match those of for ease of notation, which means that is not necessarily an observation at time but it is just matched to .
We outline the SubseqLS algorithm for one-day-ahead prediction below. All sequences mentioned are standardized.

Set query vector consisting of user U, account A’s daily balance from to .

Find sequences of length most similar to in DTW distance in the first values.

Create a template of indicators marking user U’s paycheck deposits into account A between and , with magnitude being the paycheck values. Set to be the function that aligns any sequence to this template by DTW.

Align each such that .

Estimate and nonnegative coefficients which minimize the following objective
where , and obtain and .

Estimate .
Aligning in Step 4 is an essential adjustment for temporal variations of matched sequences. Figure 4 shows how this preserves the exact cyclic pattern of .
We set in Step 2 so that similarity matching is done on a sufficiently long sequence that captures at least one semimonthly pay period. We let in the objective in Step 5 to eliminate sequences in which do not consistently match . Matrix also penalizes based on anomalous predictions of each . For the weights , we use 1 if , 5 if and 10 if . This choice emphasizes more accurate estimation of the tail of since that is the closest to the start of prediction, and is found to work well empirically. SubseqLS forecasts the entire future period by iteratively predicting for one day at a time.
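To make the regression step more concrete, the sketch below fits non-negative coefficients by weighted, penalized least squares using cyclic coordinate descent. It is a generic stand-in and not the objective of Step 5: the intercept, the exact penalty matrix, and the weight values are not reproduced here, and the per-sequence `penalties`, the function name, and the solver choice are all assumptions made for illustration.

```python
def nonneg_ridge(columns, target, weights, penalties, lam, n_sweeps=200):
    """Minimise  sum_t w_t * (target_t - sum_j beta_j * columns[j][t])**2
                 + lam * sum_j penalties[j] * beta_j**2
    subject to beta_j >= 0, by cyclic coordinate descent.

    columns: list of matched sequences, each the same length as `target`.
    """
    n_obs = len(target)
    beta = [0.0] * len(columns)
    residual = list(target)  # target minus the current fitted values
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Add column j's current contribution back into the residual.
            partial = [residual[t] + beta[j] * col[t] for t in range(n_obs)]
            num = sum(weights[t] * col[t] * partial[t] for t in range(n_obs))
            den = sum(weights[t] * col[t] ** 2 for t in range(n_obs)) + lam * penalties[j]
            # The one-dimensional minimiser, clipped at zero for non-negativity.
            beta[j] = max(0.0, num / den) if den > 0 else 0.0
            residual = [partial[t] - beta[j] * col[t] for t in range(n_obs)]
    return beta

matches = [[1.0, 2.0, 3.0], [3.0, 0.0, -1.0]]
query = [1.0, 2.0, 3.0]
unpenalised = nonneg_ridge(matches, query, [1.0, 1.0, 1.0], [1.0, 1.0], lam=0.0)
shrunk = nonneg_ridge(matches, query, [1.0, 1.0, 1.0], [1.0, 1.0], lam=14.0)
```

In this toy example the first matched sequence equals the query, so it receives all of the weight, and increasing the penalty shrinks its coefficient from 1 to 0.5. A one-day-ahead prediction would then combine the values that follow each matched sequence using the fitted coefficients.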
4.2. Extraction of recurring transactions and unexpected large expenses
We propose a procedure to automatically extract recurring transactions in each account. The identified recurring transactions are used to improve prediction accuracy as in Section 4.1, and are also directly displayed to remind users of upcoming bills they need to pay to avoid penalty charges.
Recurring transactions include bills and other periodic behavior such as semimonthly salary, monthly recurring transfer between accounts, and weekly grocery shopping. Recurring transactions are split into monthly, semimonthly, biweekly and weekly frequency. For a spending category C, for each transaction and frequency in question, we look for transactions with similar descriptions in the past few dates satisfying the frequency constraint. We illustrate the procedure for extracting monthly charges in Figure 5.
In this example, we start with a window of 7 days, obtain all transactions within, and call this . Then, we backtrack by 31 days and retrieve all the transactions 31 days before (in a window of 7 days) that have descriptions similar to those in . We repeat this procedure multiple times to ensure that the remaining transactions indeed have the desired frequency. In our application, we repeat this procedure till we obtain 4 windows of transactions. The transactions remaining in the last window are identified as recurring charges.
For monthly and semimonthly charges, we use a 7-day window to accommodate differing lengths of months, and use a 2-day window for weekly and biweekly charges to accommodate small shifts in spending due to holidays, etc. To compare transaction descriptions, we use the difflib module (dif, 2017) in Python which iteratively finds the longest contiguous matching subsequences excluding junk elements, with a similarity threshold ratio of to accommodate insignificant differences such as dates and reference numbers. Additional bills may be found through a biller’s list. In practice, we use both our method and the biller’s list to complement each other.
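The backtracking procedure for monthly charges can be sketched with `difflib.SequenceMatcher`. The sketch makes several assumptions: the similarity threshold of 0.8 is a placeholder rather than the value used in this work, each window is taken to be the 7 days ending on a reference date, and the function returns the surviving descriptions from the most recent window. All function names are hypothetical.

```python
from datetime import date, timedelta
from difflib import SequenceMatcher

def similar(a, b, threshold=0.8):
    """True when two descriptions are close under difflib's similarity ratio.

    The 0.8 threshold is a placeholder chosen for this example.
    """
    return SequenceMatcher(None, a, b).ratio() >= threshold

def in_window(transactions, window_end, window_days):
    """Transactions dated within the `window_days` days ending on `window_end`."""
    start = window_end - timedelta(days=window_days - 1)
    return [t for t in transactions if start <= t[0] <= window_end]

def recurring_candidates(transactions, window_end, period_days=31,
                         window_days=7, n_windows=4, threshold=0.8):
    """Descriptions in the latest window that recur in every earlier window.

    transactions: list of (date, description, amount) tuples.
    """
    survivors = [t[1] for t in in_window(transactions, window_end, window_days)]
    for k in range(1, n_windows):
        earlier_end = window_end - timedelta(days=k * period_days)
        earlier = [t[1] for t in in_window(transactions, earlier_end, window_days)]
        survivors = [d for d in survivors
                     if any(similar(d, e, threshold) for e in earlier)]
    return survivors

txns = [
    (date(2017, 1, 15), "NETFLIX.COM 0115", 9.99),
    (date(2017, 2, 15), "NETFLIX.COM 0215", 9.99),
    (date(2017, 3, 15), "NETFLIX.COM 0315", 9.99),
    (date(2017, 4, 15), "NETFLIX.COM 0415", 9.99),
    (date(2017, 4, 14), "IKEA", 20.0),
]
found = recurring_candidates(txns, date(2017, 4, 20))
```

The made-up subscription charge survives all four windows because its description differs only in the embedded date, while the one-off purchase is dropped at the first backtracking step.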
To predict the next occurrence of a monthly transaction, we estimate the date as the last observed date of the transaction plus one month, and the amount as the average of historical transactions. We do similarly for semimonthly, biweekly and weekly transactions.
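The next-occurrence estimate for a monthly transaction can be written in a few lines. The sketch assumes that "plus one month" means the same day of the following calendar month, clamped to the last day of that month when necessary. The clamping rule and the function names are assumptions.

```python
import calendar
from datetime import date

def add_one_month(d):
    """Same day in the next calendar month, clamped to that month's last day."""
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))

def predict_next_monthly(history):
    """history: list of (date, amount) pairs for one recurring transaction.

    Returns the estimated next date and the average historical amount.
    """
    last_date = max(d for d, _ in history)
    mean_amount = sum(a for _, a in history) / len(history)
    return add_one_month(last_date), mean_amount

next_date, next_amount = predict_next_monthly(
    [(date(2017, 1, 31), 100.0), (date(2017, 2, 28), 110.0)]
)
```

For weekly and biweekly transactions the date step would presumably be a fixed 7 or 14 days instead of a calendar month.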
We further use the recurring transactions identified to determine unexpected or anomalous large expenses. Such expenses can be displayed to each user to prepare them for otherwise unforeseen spending. We do the following steps on each user’s transactions:

Remove all recurring transactions;

Retain top 10% of remaining transactions;

Remove transactions with similar descriptions.
The extracted expenses are pooled across all users, and the list of expenses displayed to a user can be personalized depending on their characteristics (e.g., car owner, a person with children, etc.).
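The three filtering steps for one user can be sketched as below. The last step is read here as collapsing near-duplicate descriptions and keeping the largest of each, which is only one possible interpretation. The 0.8 similarity threshold, the rounding rule for the top 10%, and the function name are assumptions.

```python
import math
from difflib import SequenceMatcher

def large_expenses(transactions, recurring_descriptions,
                   top_fraction=0.10, threshold=0.8):
    """transactions: list of (description, amount) pairs for one user."""
    def similar(a, b):
        return SequenceMatcher(None, a, b).ratio() >= threshold

    # Step 1: drop anything that matches a known recurring description.
    remaining = [t for t in transactions
                 if not any(similar(t[0], r) for r in recurring_descriptions)]
    # Step 2: keep the largest `top_fraction` of what is left (rounded up).
    remaining.sort(key=lambda t: t[1], reverse=True)
    top = remaining[: math.ceil(top_fraction * len(remaining))]
    # Step 3: collapse near-duplicate descriptions, keeping the largest amount.
    unique = []
    for desc, amount in top:
        if not any(similar(desc, kept_desc) for kept_desc, _ in unique):
            unique.append((desc, amount))
    return unique

sample = [("Coffee", 5.0)] * 18 + [
    ("Car repair", 500.0),
    ("Car repair #2", 450.0),
    ("NETFLIX.COM 0415", 9.99),
]
expenses = large_expenses(sample, recurring_descriptions=["NETFLIX.COM 0315"])
```

Here the made-up subscription charge is removed as recurring, the two largest remaining transactions are retained, and the second car repair entry is merged into the first because the descriptions are nearly identical.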
5. Empirical Evaluation
5.1. Data Description
We use data collected through WageGoal for evaluation. Since WageGoal is a relatively new app, the dataset is limited in terms of the number of users and the length of usage. We include an additional financial dataset from the PKDD’99 Discovery Challenge for the prediction task to show the performance of our proposed methods on both small and large datasets. We note that the PKDD’99 Dataset is not specific to low-wage workers.
5.1.1. WageGoal Dataset
This dataset is collected from 19 individuals, with approximately one year’s worth of financial transactions from June 21, 2016, to June 16, 2017, in checking, savings, and credit card accounts. There are a total of 52 accounts of which 16 are checking, 19 are savings, and 17 are other accounts including credit cards. Each line item in the data includes date, description, amount, category and final account balance as described in Section 2. The one year’s worth of data is split into a training period of nine months and a testing period of three months. In this dataset, all users have semimonthly income.
5.1.2. PKDD’99 Dataset
This is a publicly available dataset containing real anonymized bank transactions dating from January 1, 1993 to December 31, 1998 (Berka and Sochorova, 2018). We want to test our methods in the scenario where we have access to long historical data, and hence retain the 2263 accounts in this dataset which have at least four years of data. Accounts have an average of 52.367 transactions per year, with the minimum and maximum at 52.310 and 52.479 respectively. Due to the sparsity of transactions, we consider weekly instead of daily balances. The six years’ worth of data is split into a training period of four-and-a-half years and a testing period of one-and-a-half years. Each line item in the data includes date and amount. No information on the description and category of transactions is provided, and hence this dataset is not used to evaluate any other task besides prediction.
5.2. Prediction of account balances
From the test set, 25 length-31 sequences are randomly chosen for prediction. We compute two different error measures for evaluation:

MAE (Mean Absolute Error), the average absolute difference between the actual and the predicted account balance;

Average difference in dollar amounts between the true and predicted balances when the true balance becomes negative.
The point in time at which balances become negative is of special interest because accounts will be charged penalty fees from that point onwards. We use only the first error measure on the PKDD’99 dataset. The second error measure is not applicable because true account balances in the PKDD’99 dataset are unknown and we arbitrarily set all account balances to start at 0. To calculate the error measures, we scale all accounts to have variance equal to 100 so that they contribute approximately equally.
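The two error measures and the variance scaling can be sketched as follows. The second measure is interpreted here as the absolute gap on the first day the true balance is below zero, to be averaged across sequences afterwards, and the scaling is assumed to use the population variance. Both readings are assumptions, as are the function names.

```python
import statistics

def mae(actual, predicted):
    """Mean absolute error between two equally long balance sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def error_at_first_negative(actual, predicted):
    """Absolute gap on the first day the true balance is below zero.

    Returns None when the true balance never becomes negative.
    """
    for a, p in zip(actual, predicted):
        if a < 0:
            return abs(a - p)
    return None

def scale_to_variance(balances, target_variance=100.0):
    """Rescale one account's balances so their population variance equals the target."""
    sd = statistics.pstdev(balances)
    if sd == 0:
        return list(balances)  # a constant series cannot be rescaled
    factor = target_variance ** 0.5 / sd
    return [b * factor for b in balances]

scaled = scale_to_variance([0.0, 2.0, 4.0])
```

Scaling every account to the same variance keeps accounts with large balances from dominating the averaged errors.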
We compared the performance of the individual methods HistAvg and SubseqLS, the hybrid HistAvgSubseqLS, as well as Prophet, ARMA, NearestNeighbor and KNN averaging. Prophet is a state-of-the-art forecasting method from Facebook (Taylor and Letham, 2017) that uses a regression model to fit a linear trend, and incorporates weekly seasonality and holiday effects by marking them through indicator variables. We used paydays in lieu of holiday effects. ARMA is a well-established model for stationary stochastic processes (Shumway and Stoffer, 2006), and we implemented ARMA with parameters found through the statsmodels module using default arguments (sta, 2017). K-Nearest Neighbor is a popular nonparametric approach that is flexible and applied widely in diverse domains such as traffic flow and energy (Alvarez et al., 2011; Zhang et al., 2013). In NearestNeighbor, only the top matched sequence to the query is used by directly taking the value immediately after the match as the one-day-ahead prediction, and in KNN averaging, the top 10 matched sequences are used and the one-day-ahead prediction is the average of the values immediately after all 10 matches.
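The two nearest-neighbor baselines can be illustrated with one small function. For brevity the sketch matches windows within a single balance series using Euclidean distance, which is an assumption about the baseline's matching step. Setting `k=1` corresponds to NearestNeighbor and `k=10` to KNN averaging. The function name is hypothetical.

```python
def knn_one_day_ahead(history, query, k=10):
    """Average the values that follow the k historical windows closest to `query`.

    history: one long balance series.
    query: the most recent len(query) balances.
    """
    n = len(query)
    scored = []
    for start in range(len(history) - n):
        window = history[start:start + n]
        dist = sum((w - q) ** 2 for w, q in zip(window, query)) ** 0.5
        scored.append((dist, history[start + n]))  # value right after the window
    if not scored:
        raise ValueError("history is too short for the given query length")
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    return sum(value for _, value in nearest) / len(nearest)

series = [1, 2, 3, 1, 2, 3, 1, 2, 4]
nearest_only = knn_one_day_ahead(series, [1, 2], k=1)
averaged = knn_one_day_ahead(series, [1, 2], k=3)
```

Because every selected match receives equal weight, a single anomalous continuation (the final 4 in the example) directly shifts the averaged prediction.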
5.2.1. Prediction with WageGoal Dataset
The WageGoal Dataset is split into two types of accounts: those with paycheck income transactions (20 accounts), and those without (32 accounts). The former show more pronounced cyclic patterns as in Figure 7 and 7. Hence, the two types of accounts need different treatments.
The training set is used to optimize the number of matches and penalty parameter in SubseqLS by cross-validation and also to determine the switching parameter for HistAvgSubseqLS. Search range for is in multiples of 5 between 5 and 25, values are between 0 and 10, and values are integers between 0 and . For paycheck accounts, and . For non-paycheck accounts, and , meaning HistAvgSubseqLS reduces to HistAvg. Parameter is determined individually for each account and hence not reported here.
Table 2 shows the test results. HistAvgSubseqLS almost always performs the best, and is otherwise a close second. Figure 8 plots the average absolute difference between the actual and predicted account balance across time.
Balances in paycheck accounts tend to have pronounced semimonthly patterns starting with a sharp increase at payday followed by a decrease to approximately pre-payday levels. These cyclic patterns sometimes perpetuate through historical data and are shared across users, which makes sequence-matching approaches suitable. SubseqLS benefited from regularization in this small dataset because not all top matches were close matches. KNN, in contrast, does not make this distinction and had average prediction error at least 50 times higher than all other methods. As seen from Figure 7(a), SubseqLS maintained almost consistent error across the entire 31-day prediction period, while other methods had higher errors predicting further ahead. Due to the regression formulation, SubseqLS focuses on modeling the overall trend of the query vector instead of next-day prediction. Weights shift some of that focus to short-term predictions, but it is difficult to tune exactly. As in Figure 7(a), HistAvg had the best short-term predictions since its next-day prediction is designed to be close to the current day observation unless it has knowledge of an impending recurring charge. The switching parameter let HistAvgSubseqLS use HistAvg for the first 3 days before switching to SubseqLS, thereby attaining the lowest errors in both the short and long term.
Non-paycheck accounts are mostly used for savings and occasional purchases. Common transactions include spare change saved through banks’ “keep the change” programs. The lack of prominent structures in the balance sequences resulted in poor matches found for sequence-matching approaches such as NearestNeighbor, KNN and SubseqLS. As in Figure 7(b), HistAvg performed the best by making conservative predictions using a basic daily spending and recurring transactions. The switching parameter correctly picked to use HistAvg throughout the prediction period, such that HistAvgSubseqLS shared the same good performance as HistAvg.


5.2.2. Prediction with PKDD’99 Dataset
Balances exhibit cyclic patterns as in Figure 9, just like those in the WageGoal paycheck accounts. We apply the same SubseqLS parameter , and determine by cross-validation. Since transaction descriptions and categories are not available, we cannot extract any recurring transaction for HistAvg, SubseqLS, and Prophet. We also do not implement the hybrid HistAvgSubseqLS, since HistAvg, being reliant on good estimates of recurring transactions, is heavily handicapped in this setup and is not expected to benefit the hybrid approach.
In each iteration of the experiment, we randomly select 52 out of the total 2263 accounts to perform prediction. We present test results across eight iterations in Table 3. Figure 10 plots the average absolute difference between the actual and predicted account balance across time. Solid lines are the mean across all iterations, and the shaded regions are the to percentile.
Results mirror those in Section 5.2.1 and lead to the same conclusion that SubseqLS performs the best in long-term predictions. The quality of SubseqLS predictions is consistent across the different scenarios and dataset sizes represented by the two datasets. Furthermore, in this larger dataset, SubseqLS achieved low errors in short-term predictions as well. We note that larger dataset sizes benefit sequence-matching methods in general since more close matches can be found across time and users. For instance, KNN’s performance for the PKDD’99 Dataset is also much higher than that for the WageGoal Dataset.
Methods  Mean of MAE  Standard deviation of MAE 
HistAvg  12.245  0.578 
SubseqLS  7.017  0.277 
Prophet  7.794  0.311 
ARMA  7.967  0.455 
NearestNeighbor  9.243  0.674 
KNN  8.012  0.751 
5.3. Extraction of recurring transactions and unexpected large expenses
From the test set of the WageGoal Dataset, 25 dates are randomly picked. For each date, we extract recurring transactions prior to the date, and predict the dates for their next occurrence. A transaction is correctly extracted if the prediction is within 5 days of the true date. We evaluate the quality of the extraction procedure by:

Average number of true recurring transactions extracted per user;

Precision, i.e., proportion of recurring transactions extracted that is true;

Average error in days for the predicted dates at which recurring transactions next occur.
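The three evaluation quantities can be computed as in the sketch below. It assumes that predicted and true next-occurrence dates are keyed by a transaction label, and that the average date error is taken over the correctly extracted transactions only. Both are assumptions about bookkeeping rather than details from the evaluation itself, and the function name is hypothetical.

```python
from datetime import date

def evaluate_extraction(predicted, actual, tolerance_days=5):
    """Score extracted recurring transactions against the truth.

    predicted / actual: dicts mapping a transaction label to its predicted /
    true next date. Returns (number correct, precision, mean date error in days).
    """
    errors = []
    for label, predicted_date in predicted.items():
        true_date = actual.get(label)
        if true_date is None:
            continue  # extracted, but not a true recurring transaction
        gap = abs((predicted_date - true_date).days)
        if gap <= tolerance_days:
            errors.append(gap)
    n_correct = len(errors)
    precision = n_correct / len(predicted) if predicted else 0.0
    mean_error = sum(errors) / n_correct if n_correct else None
    return n_correct, precision, mean_error

scores = evaluate_extraction(
    {"rent": date(2017, 5, 1), "netflix": date(2017, 5, 15), "gym": date(2017, 5, 10)},
    {"rent": date(2017, 5, 2), "netflix": date(2017, 5, 15), "gym": date(2017, 5, 20)},
)
```

In the example, two of the three extracted transactions fall within the 5-day tolerance, giving a precision of two thirds and an average date error of half a day.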
We compare the performance of our proposed procedure with an extraction method utilizing transaction descriptions and category labels. The competing method flags a transaction as recurring if the description contains the word ‘recurring’ or if the category label contains the following keywords: ‘bill pay’, ‘payroll’, ‘service  insurance’ and ‘service  subscription’. These rules are manually formulated based on close inspection of the dataset.
As seen in Table 4, the proposed method identified a larger number of true recurring transactions and with higher precision. Despite extracting more than double the number of true recurring transactions, the proposed method was only half a day worse on average in predicting the next transaction date. We combine the recurring transactions found by both methods above and use them to obtain a list of unexpected large expenses. Some examples of their approximate costs are shown in Table 5. We provided only a single value for each cost, but given observations of the expense from more users, a range or empirical distribution would be appropriate.
Method  # extracted  Precision  Average error (days) 
Proposed  4.633  0.647  1.465 
Using labels  2.161  0.311  1.043 
Description  Cost ($) 
House cleaning service  350 
Car repair  500 
Student loan  5000 
Roofing  8000 
6. Significance and Impact
Our system will upgrade and replace existing models in WageGoal, and therefore have a direct and near-immediate impact on the mostly low-income individuals currently connected to the product. The enhancements will help users manage their volatile cash flow, capitalize on opportunities for debt reduction and savings, obtain greater overall financial health and stay out of poverty. Strategic partnerships will help Neighborhood Trust further penetrate relevant markets in the coming years, eventually reaching many tens of thousands of clients nationwide through its technology platforms.
Robust tracking systems are in place to measure the expected outcomes. Results will be shared as appropriate via Neighborhood Trust’s network of financial empowerment providers and other interested stakeholders, thereby increasing the potential visibility and scale of this promising approach.
7. Conclusion
We proposed a system of data mining techniques to predict and analyze spending behaviors in a small data scenario. Future work includes improving the predictive models by incorporating additional individual and group-level information, providing early and enhanced visibility for users into their financial health, and automatically generating personalized recommendations for improving financial stability. Depending on the feedback from the deployment, we will also consider other improvements as necessary.
References
 sta (2017) StatsModels: Statistics in Python. (accessed Jan 28, 2017).
 dif (2017) difflib: Helpers for computing deltas. (accessed Oct 9, 2017).
 Alvarez et al. (2011) F. Martinez Alvarez, A. Troncoso, J. C. Riquelme, and J. S. Aguilar Ruiz. 2011. Energy Time Series Forecasting Based on Pattern Sequence Similarity. IEEE TKDE 23, 8 (Aug 2011), 1230–1243.
 Barr (2004) Michael S. Barr. 2004. Banking the Poor: Policies to Bring Low-Income Americans Into the Financial Mainstream. In Research Brief. The Brookings Institution.
 Berka and Sochorova (2018) Petr Berka and Marta Sochorova. 1999 (accessed Jan 28, 2018). PKDD’99 Discovery Challenge - Guide to the Financial Data Set.
 Birant (2011) Derya Birant. 2011. Data Mining Using RFM Analysis. In Knowledge-Oriented Applications in Data Mining, Kimito Funatsu (Ed.). InTech, Chapter 6, 91–208.
 Brown et al. (2015) Sarah Brown, Pulak Ghosh, Li Su, and Karl Taylor. 2015. Modelling household finances: A Bayesian approach to a multivariate two-part model. Journal of Empirical Finance 33, C (2015), 190–207.
 Chang and Tsai (2011) Hui-Chu Chang and Hsiao-Ping Tsai. 2011. Group RFM analysis as a novel framework to discover better customer consumption behavior. Expert Systems with Applications 38, 12 (2011), 14499–14513.
 Elfeky et al. (2005) Mohamed G. Elfeky, Walid G. Aref, and Ahmed K. Elmagarmid. 2005. Periodicity Detection in Time Series Databases. IEEE TKDE 17, 7 (July 2005), 875–887.
 Fusaro and Shaefer (2016) Vincent Fusaro and H. Shaefer. 2016. How should we define “low-wage” work? An analysis using the Current Population Survey. Monthly Labor Review, U.S. Bureau of Labor Statistics (10 2016).
 Glynn et al. (2006) Earl F. Glynn, Jie Chen, and Arcady R. Mushegian. 2006. Detecting Periodic Patterns in Unevenly Spaced Gene Expression Time Series Using Lomb–Scargle Periodograms. Bioinformatics 22, 3 (Feb. 2006), 310–316.
 Gonzaga Baca Ruiz et al. (2016) Luis Gonzaga Baca Ruiz, Manuel Cuéllar, Miguel Calvo-Flores, and Maria del Carmen Pegalajar Jiménez. 2016. An Application of Non-Linear Autoregressive Neural Networks to Predict Energy Consumption in Public Buildings. Energies 9 (08 2016), 684.
 Insider (2017b) Business Insider. 2016 (accessed Oct 4, 2017)b. Here’s what the typical one-bedroom apartment costs in 50 US cities.
 Insider (2017a) Business Insider. 2017 (accessed Oct 4, 2017)a. The 5 best apps to help you manage your money.
 Khajvand et al. (2011) Mahboubeh Khajvand, Kiyana Zolfaghar, Sarah Ashoori, and Somayeh Alizadeh. 2011. Estimating customer lifetime value based on RFM analysis of customer purchase behavior: Case study. Procedia Computer Science 3 (2011), 57–63. World Conference on Information Technology.
 L. Scott and R. Varian (2014) Steven L. Scott and Hal R. Varian. 2014. Predicting the Present with Bayesian Structural Time Series. International Journal of Mathematical Modelling and Numerical Optimisation 5 (01 2014), 4–23.
 Min and Agresti (2002) Yongyi Min and Alan Agresti. 2002. Modeling Nonnegative Data with Clumping at Zero: A Survey. Journal of the Iranian Statistical Society 1 (2002).
 Mullahy (1998) John Mullahy. 1998. Much ado about two: reconsidering retransformation and the two-part model in health econometrics. Journal of Health Economics 17, 3 (1998).
 Mullainathan and Shafir (2010) Sendhil Mullainathan and Eldar Shafir. 2010. Savings Policy and Decisionmaking in Low-Income Households. In National Poverty Center Policy Briefs. University of Michigan, Chapter 24.
 Neelon et al. (2011) Brian Neelon, A. James O’Malley, and Sharon-Lise T. Normand. 2011. A Bayesian Two-Part Latent Class Model for Longitudinal Medical Expenditure Data: Assessing the Impact of Mental Health and Substance Abuse Parity. Biometrics 67, 1 (2011), 280–289.
 Neighborhood Trust Financial Partners (2017) Neighborhood Trust Financial Partners. 2016 (accessed Sep 16, 2017). Neighborhood Trust Financial Partners And FlexWage Solutions Announce Partnership to Develop WageGoal.
 Pew Charitable Trusts (2016) Pew Charitable Trusts. 2016. Consumers Need Protection From Excessive Overdraft Costs. A brief from The Pew Charitable Trusts (12 2016).
 Plaid (2018) Plaid. 2018 (accessed Feb 7, 2018). https://plaid.com/.
 Rakthanmanon et al. (2012) Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, and Eamonn Keogh. 2012. Searching and Mining Trillions of Time Series Subsequences Under Dynamic Time Warping. In 18th ACM SIGKDD. 262–270.
 Shumway and Stoffer (2006) Robert Shumway and David Stoffer. 2006. Time Series Analysis and Its Applications - With R Examples (2 ed.). Springer, New York.
 Taylor and Letham (2017) Sean J Taylor and Benjamin Letham. 2017. Forecasting at Scale. PeerJ Preprints (2017).
 Williams and A. Hoel (2003) Billy Williams and Lester A. Hoel. 2003. Modeling and Forecasting Vehicular Traffic Flow as a Seasonal ARIMA Process: Theoretical Basis and Empirical Results. Journal of Transportation Engineering 129 (11 2003), 664–672.
 Zhang et al. (2013) Lun Zhang, Qiuchen Liu, Wenchen Yang, Nai Wei, and Decun Dong. 2013. An Improved K-nearest Neighbor Model for Short-term Traffic Flow Prediction. Procedia - Social and Behavioral Sciences 96, Supplement C (2013), 653–662.
 Zhang (2003) Peter Zhang. 2003. Time Series Forecasting Using a Hybrid ARIMA and Neural Network Model. Neurocomputing 50 (01 2003), 159–175.