Purchased or collected data is often not in the structure your project needs. At DP Financial, we can clean the data and organise it the way you need. You may also need to merge multiple datasets, in which case it is critical that they are merged correctly.

 

At DP Financial, we are familiar with statistical software and a range of databases, which allows us to merge data correctly and efficiently.

Data Cleaning

1

Case Study: improving data quality

Database Involved: provided by the client

Software: Stata, MS Excel

Service Provided:

A client collected a dataset of currency exposures faced by companies, covering currencies such as AUD, USD, NZD and CNY. One issue the client wanted resolved was that currency names were recorded in various forms, sometimes with typos. For example, AUD appeared as AUD, Australian dollar, Austalian dollar, Australia currency, etc. The data was provided as separate Excel spreadsheets for each company. To save processing time, we first used Stata to convert and merge the data for all companies. We then cleaned the data by assigning a consistent currency code to all variations of the same currency.
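As an illustration only, the currency-code clean-up described above could be sketched in Python as follows. The project itself used Stata. The alias table, the function name normalise_currency and the similarity cutoff are assumptions made for this example, not the rules applied for the client.

```python
import difflib

# Hypothetical alias table: the variants actually found in the client's
# data are not listed in full, so these entries are examples only.
ALIASES = {
    "aud": "AUD",
    "australian dollar": "AUD",
    "australia currency": "AUD",
    "usd": "USD",
    "us dollar": "USD",
    "nzd": "NZD",
    "new zealand dollar": "NZD",
    "cny": "CNY",
    "chinese yuan": "CNY",
}


def normalise_currency(raw, cutoff=0.85):
    """Map a free-text currency label to a consistent code, or None."""
    # Lower-case and collapse whitespace so "  AUD " and "aud" compare equal.
    key = " ".join(raw.lower().split())
    if key in ALIASES:
        return ALIASES[key]
    # Fall back to fuzzy matching to catch typos such as "Austalian dollar".
    close = difflib.get_close_matches(key, list(ALIASES), n=1, cutoff=cutoff)
    return ALIASES[close[0]] if close else None
```

Labels that return None are best reviewed by hand rather than guessed, and a fairly strict cutoff keeps short codes such as AUD and USD from being confused with one another.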

 

Another issue was that the same currency might be purchased and sold multiple times during the same reporting period via different contracts, which resulted in multiple observations for the same currency. Drawing on our knowledge of financial markets, we identified potential mistakes in the dataset and aggregated the contract values for the same currency in the same report.
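The aggregation step can be sketched in the same way. This is a minimal Python example, assuming each contract is a record with company, period, currency and value fields. These field names are hypothetical, and how purchases and sales are signed is left to the caller.

```python
from collections import defaultdict


def aggregate_exposures(rows):
    """Sum contract values per (company, period, currency).

    Each row is a dict with the hypothetical keys "company", "period",
    "currency" and "value". Values are added as given, so any sign
    convention for purchases versus sales must be applied beforehand.
    """
    totals = defaultdict(float)
    for row in rows:
        key = (row["company"], row["period"], row["currency"])
        totals[key] += row["value"]
    return dict(totals)
```

After aggregation, each company, reporting period and currency combination appears exactly once, which is the structure the analysis needs.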

2

Case Study: merging datasets

Database Involved: CRSP, BoardEx, SDC, WorldScope, etc.

Software: SAS, MS Excel

Service Provided:

A client’s project on cross-border M&As involved multiple databases, including CRSP, BoardEx, SDC, and WorldScope.

 

After a comprehensive discussion with the client, we collected the data according to the instructions provided and merged it in SAS using the unique identifiers common to the databases, including CUSIP, SDC Deal Number, etc. Because of inconsistencies across the databases, it was important to merge the datasets accurately and without duplication.
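To show what merging without duplication means in practice, here is a minimal Python sketch of a one-to-one merge on a shared identifier. The project itself used SAS. The function name merge_one_to_one and the record layout are assumptions made for this example.

```python
def merge_one_to_one(left, right, key):
    """Merge two lists of dict records on a shared identifier.

    Raises ValueError if the identifier is duplicated in either dataset,
    so a merge can never silently multiply rows. Returns the merged
    records and the left-hand records that found no match.
    """
    index = {}
    for row in right:
        k = row[key]
        if k in index:
            raise ValueError(f"duplicate {key} in right dataset: {k!r}")
        index[k] = row

    seen = set()
    merged, unmatched = [], []
    for row in left:
        k = row[key]
        if k in seen:
            raise ValueError(f"duplicate {key} in left dataset: {k!r}")
        seen.add(k)
        if k in index:
            # Where both sides share a column, the left-hand value is kept.
            merged.append({**index[k], **row})
        else:
            unmatched.append(row)
    return merged, unmatched
```

Checking the unmatched list after each merge shows how many records were lost to identifier inconsistencies between databases, and failing on duplicates forces them to be resolved before the merge rather than after it.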

3

Case Study: cleaning a data set provided

Database Involved: provided by the client

Software: SAS, MS Excel

Service Provided:

A client purchased a dataset on the miscellaneous expenses incurred by companies. Since companies report their expenses in different ways, the same expense may be named or categorised differently. We assisted the client in cleaning the data by re-assigning these expenses to appropriate categories according to a pre-defined rule.
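A rule-based re-assignment of this kind could look like the following minimal Python sketch. The project itself used SAS. The categories and keywords shown are hypothetical and stand in for the client's pre-defined rule, which is not reproduced here.

```python
# Hypothetical rule table: each category is paired with keywords that
# identify it. Rules are checked in order and the first match wins.
RULES = [
    ("Travel", ("travel", "airfare", "hotel")),
    ("Legal", ("legal", "solicitor")),
    ("Marketing", ("advertising", "marketing")),
]


def categorise(label, rules=RULES, default="Other"):
    """Assign an expense label to the first category whose keyword it contains."""
    text = label.lower()
    for category, keywords in rules:
        if any(word in text for word in keywords):
            return category
    return default
```

Keeping the rules in an ordered table makes the categorisation easy to audit, and labels that fall into the default category can be reviewed by hand.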

Contact Us

© 2021 by DP Financial Group Pty Ltd          Melbourne, Australia        Email: yudong.zheng@dpfinancialgroup.com
