Predict future values recursively with an expanding window in R

Image Credits: Alessandro Oliverio from Pexels

Why bother forecasting global chip sales? Simply put, global chip sales exhibit strong correlation with the PHLX Semiconductor Sector ETF, which tracks the 30 largest semiconductor-related companies worldwide.

Thus, if we can forecast the trend of global chip sales accurately, we may be able to improve buy/sell decisions with regard to the PHLX.

The long-term trend of the semiconductor industry appears to be upward. In the short run, however, the industry is cyclical and tends to be affected by fluctuating sales growth.

A text-based analysis in R

Image Credits: Shi Min Teh (Unsplash)

Every year, around February, our Finance Minister unveils the Singapore Budget. But the Budget Statement is often quite wordy, with the document spanning 40–50 pages.

Furthermore, if we wanted to study how past trends have evolved over the years to understand the government’s position, we may have to manually read past Budget Statements, which is time-consuming.

Using R and a little dose of text-based analysis, we can: (1) efficiently obtain the main themes in Budget 2021, and (2) quickly compare against previous trends in past Budget Statements, to study if the government’s position has changed.

Specifically, by comparing…

A comparison with the established banks in South Korea

Image Credits: Mohamed Hassan from Pixabay

Since there has already been a plethora of articles published online on virtual banks ranging from their app store ratings to customer acquisition rate and satisfaction metrics, I wanted to focus on something a little different.

I thought it might be interesting to use classical financial metrics such as the Net Profit Margin, Net Interest Margin (NIM), Return on Assets (ROA) and Return on Equity (ROE), among others, to see how the virtual banks compare against the established banks in South Korea.

The following are the 2 main questions I hope to address in this post:

Step-by-step guide to web-scraping with Rvest & RSelenium

Image Credits: David Kubovsky from Unsplash

Greetings! If you have made it here, I assume that you have read Part 1 of this 2-part series, where we explored the use of eXtreme Gradient Boosting (XGBoost) to predict public housing (HDB) prices in Singapore.

This post provides a step-by-step guide on how we derived the data (used in Part 1) from the property website using web-scraping techniques.

Web-scraping Ethics
Before we proceed any further, it is always prudent to check first whether we are allowed to web-scrape from a given website. …

An eXtreme Gradient Boosting analysis in R

Image Credits: Koon Boon Goh from Pixabay

Recently, a relative of mine decided to go house-hunting in the HDB resale market and wanted to know the “fair value” of a modest 3-room flat. Intrigued, I wanted to see if I could answer her question, and study whether it is possible to predict housing prices in the HDB resale market.

Research Questions

(1) What is the median price of a 3-room HDB resale flat?
(2) What are the important variables in explaining HDB resale price?
(3) How accurately can we predict the price of HDB resale flats given a set of characteristics about the property?


Using web-scraping techniques, I…

A web-scraping analysis with R

Image Credits: André François McKenzie, from Unsplash

Since 2015, there has been an exponential increase in FinTech private funding flows into Singapore. According to Accenture, Singapore’s FinTech industry received a record high of USD 861 million in funding from private investors in 2019, which cemented its reputation as the 5th largest FinTech market in the Asia-Pacific region.

Given the massive inflow of private funds into Singapore’s FinTech industry, it might be noteworthy to study if the fledgling industry has made gains in terms of firm formation and employment growth. What types of FinTech firms have set up their businesses on Singapore’s shores? What financial activity do they…

Image Credits: Johan Van Wambeke, from Unsplash

This post is more of an academic time series forecasting exercise, intended to illustrate ARIMA modelling with seasonality and an exogenous regressor.

Here, I attempt to forecast Singapore’s monthly visitors by air from Indonesia. For the purpose of this exercise, I used data from January 1998 to June 2013. The dataset can be downloaded from the CEIC database (ID: 36590201 | SR Code: SR552981).

At a glance, the series does not look stationary! Some form of seasonal fluctuation also appears to be present in the data. Taking the natural logarithm can help to stabilize the variance of the series.
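
To make the log transform and the seasonal ARIMA idea concrete, here is a minimal sketch in base R. It runs on simulated data, not the CEIC series, and the `shock` dummy, the model orders and every number in the simulation are my own assumptions for illustration only; they are not the specification used in the post.

```r
# Simulated monthly series standing in for the visitor data (NOT the CEIC series).
set.seed(42)
n <- 186                       # January 1998 to June 2013 is 186 months
month_idx <- seq_len(n)

# Hypothetical exogenous regressor: a dummy marking an arbitrary shock period.
shock <- as.numeric(month_idx >= 60 & month_idx <= 66)

# Build the series on the log scale: stochastic trend + seasonality + shock + noise.
trend     <- cumsum(rnorm(n, mean = 0.004, sd = 0.02))
seasonal  <- 0.15 * sin(2 * pi * month_idx / 12)
log_level <- 10 + trend + seasonal - 0.3 * shock + rnorm(n, sd = 0.03)
visitors  <- ts(exp(log_level), start = c(1998, 1), frequency = 12)

# Natural log: multiplicative seasonal swings become roughly additive,
# which helps stabilize the variance (it does not remove the trend).
log_visitors <- log(visitors)

# Seasonal ARIMA with an exogenous regressor. The (0,1,1)(0,1,1)[12] orders
# are illustrative; in practice they would be chosen from ACF/PACF plots
# or information criteria.
fit <- arima(log_visitors,
             order    = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12),
             xreg     = shock)

# Forecast 12 months ahead, assuming no shock in the forecast period.
fc <- predict(fit, n.ahead = 12, newxreg = rep(0, 12))

# Back-transform the point forecasts to the original scale.
forecast_level <- exp(fc$pred)
print(coef(fit))
print(forecast_level)
```

Differencing is handled inside `arima()` through the `order` and `seasonal` arguments, so the logged series is passed in undifferenced. The regressor values for the forecast horizon have to be supplied through `newxreg`, which means the exogenous variable must itself be known or forecast in advance.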

An Analysis Based on 19 Years of Past Budget Speeches With R.

Image Credits: Kirill Petropavlov, from Unsplash

About a year ago, my university (NTU) invited our Finance Minister Mr. Heng Swee Keat for the Ministerial Forum, which was attended by about 700 students and guests.

One student posed this interesting question to the Finance Minister: “Is Singapore turning into a support state?” He had found that the word “support” appeared very frequently in past yearly budget speeches.

Spurred by that student’s question, I decided to perform an analysis on budget speeches across 19 years, from 2002 to 2020, to explore if his claim was true. …

Cheong Wei Si

R Enthusiast | Data Science | Forecasting | Machine Learning
