First, let’s answer the basic question: What is a data warehouse? A data warehouse is a SQL database built for the asset manager to store the important data and documents manufactured at their offices for analytics and distribution. If there’s any tangible evidence an asset manager can use to prove who they are and what they do, it’s in the data.
Assembling and storing the data is the first task. That data should be stored in a SQL database so you can access as many periods of data as needed: monthly, quarterly, yearly. If you use Excel workbooks to store the important data, how do you compare one quarter to the next, one month to the next, one year to the next? You would have to open several workbooks. (Don’t get us wrong. We think Excel is a wonderful tool, but not a wonderful storage device.) Think about it. Regulations require registration documents alone to be stored for a number of years. Then there are compliance rules that demand storage of quantitative data for a long period as well. Just as the medical field must keep its files, and lawyers must keep their files, investment managers must keep their files too. So where are you going to keep all this stuff? Most managers are using workbooks, their network drives, the cloud, a drop box…
There’s a multitude of ways to corral the cats and get this done, but ultimately, the best way is to build your own SQL database. Easy enough, right? Not really. Many asset managers have tried to do that, and it’s very, very expensive. Not only is it costly to build, but it’s extraordinarily expensive to maintain because the demands of data are constantly evolving. So, again, to answer the question “What is a data warehouse?”: it is a SQL database the asset manager uses to assemble, store, and manage their data in one place. The data warehouse is used to store everything: private data used only by the firm itself for internal analytics; public data that must be kept secure and disseminated as needed; and documents, collateral, registrations, etc. A data warehouse in which the asset manager can assemble and manage all their data and documents is a crucial tool to have. [To note, we offer one to our clients.]
As we’ve stated, an asset manager needs a data warehouse. Whatever the manager wants to show the world, to prove to the world what they do, it’s going to be in the data. The longer the history of the data, the more necessary the data warehouse is. If you’ve only been around a year, your track record is only a year, and you only have three people in the firm, you might be able to survive on Excel workbooks, but not for very long: as soon as you’ve had four quarters of data, you’ll have to open four workbooks to compare one quarter of data to the next.
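To make the quarter-to-quarter comparison concrete, here is a minimal sketch of what it can look like once the periods live in one SQL table instead of separate workbooks. The table name `quarterly_returns`, its columns, the product name, and the figures are all hypothetical, and SQLite stands in for whatever SQL database a firm actually uses.

```python
import sqlite3

# Hypothetical schema and made-up figures, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quarterly_returns ("
    "product TEXT, year INTEGER, quarter INTEGER, return_pct REAL, "
    "PRIMARY KEY (product, year, quarter))"
)
conn.executemany(
    "INSERT INTO quarterly_returns VALUES (?, ?, ?, ?)",
    [
        ("Example Fund", 2023, 1, 2.0),
        ("Example Fund", 2023, 2, -1.0),
        ("Example Fund", 2023, 3, 3.5),
        ("Example Fund", 2023, 4, 1.5),
    ],
)


def quarter_over_quarter(conn, product):
    """Return (year, quarter, return_pct, change_vs_prior_quarter) rows.

    The first quarter on record has no prior quarter, so its change is None.
    """
    sql = """
        SELECT cur.year, cur.quarter, cur.return_pct,
               cur.return_pct - prev.return_pct AS change
        FROM quarterly_returns AS cur
        LEFT JOIN quarterly_returns AS prev
          ON prev.product = cur.product
         AND prev.year * 4 + prev.quarter = cur.year * 4 + cur.quarter - 1
        WHERE cur.product = ?
        ORDER BY cur.year, cur.quarter
    """
    return conn.execute(sql, (product,)).fetchall()


rows = quarter_over_quarter(conn, "Example Fund")
for year, quarter, return_pct, change in rows:
    print(year, quarter, return_pct, change)
```

One query returns every quarter next to its change from the prior quarter, and the same query keeps working as more years of history are added.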
So, no matter what size your firm is, or how long you’ve been in business, you will need a data warehouse to store your data, access your data, manage your data, and actively work with your data. How else are you going to do it? Are you going to copy and paste data by hand for each of your many purposes? That is a tedious job that leaves plenty of room for errors. We have a saying here: “If one human is touching data, you better have another three humans to throw at it.” Why? Because humans make errors. Most of the data going into and out of the data warehouse can be imported and exported automatically. The data warehouse itself should have tools embedded in it to find and catch errors, outliers, etc. So why do you need a data warehouse? Because where else are you going to store your multitude of data and documents? You can have one data warehouse that does everything for you, or you can have several gimmicky sources. Can you survive with an abundance of gimmicky sources to manage your data? Yes. But why would you if you can have access to your own data warehouse where all the data and all the documents are stored in one place?
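As an illustration of the kind of embedded check that catches errors and outliers, here is a minimal sketch. It is not a description of any particular product: the function name `flag_outliers`, the fixed `max_deviation` threshold, and the sample figures are assumptions, and a real warehouse would likely use rules tuned to each dataset.

```python
from statistics import median


def flag_outliers(returns, max_deviation=10.0):
    """Flag periods whose value is missing or far from the series median.

    `returns` maps a period label to a percentage return (or None).
    `max_deviation` is an arbitrary illustrative threshold in percentage points.
    """
    present = [value for value in returns.values() if value is not None]
    mid = median(present)
    flagged = {}
    for period, value in returns.items():
        if value is None:
            flagged[period] = "missing"
        elif abs(value - mid) > max_deviation:
            flagged[period] = "outlier"
    return flagged


# Made-up monthly returns: March looks like 1.20 keyed in as 120.0,
# and April was never entered.
monthly = {
    "2023-01": 1.2,
    "2023-02": 0.8,
    "2023-03": 120.0,
    "2023-04": None,
    "2023-05": 1.1,
}
issues = flag_outliers(monthly)
print(issues)
```

A check like this does not decide what the right number is; it only points a human at the periods worth a second look.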
Now you might be asking “What warehouse(s) do I use? A single warehouse? Multiple warehouses? Whose warehouse(s)?” We highly recommend using one warehouse that can store everything you need as an investment manager: both private and public data, including narratives, DDQs and RFPs, fact sheets, registrations, marketing collateral, etc. Having one consolidated warehouse for all of your many uses is ideal. We’ve been asked many times by managers who have hired us and started using the consolidated data warehouse we built for them, “Can you now populate my warehouse with our historical data? Can you go get the data from someone else?” And of course, we can, unless the data belongs to someone else. When you give your data to a public or private database, you let go of ownership, and thus control, of that data, and now it belongs to them. That’s why you want a proprietary database of your own, a data warehouse of your own: it makes you, and no one else, the owner of the data.
Can you source the data from the firm’s vendors? Yes, you can get the data from your portfolio management system that’s connected to your custodians and pull data from your other in- or out-sourced systems. Yes, your warehouse can connect to all your data sources. But one of the things to keep in mind is you not only need to bring the data in, but you also need to connect the data out. Both importing and exporting of data can easily be done, but you have to have someone managing that SQL database—the data warehouse. [By the way, that’s what we are most proficient in doing: providing a data warehouse, and even better, a data warehouse that’s evolving and keeping up with the demands of data.] The demands of data are not decreasing; as a matter of fact, they’re increasing quickly.
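Bringing data in and connecting data out can be pictured with a small sketch. Everything here is an assumption for illustration: the `holdings` table, the CSV column names, and the sample rows are made up, and a real feed from a portfolio management system or custodian would have its own format.

```python
import csv
import io
import sqlite3


def import_holdings(conn, csv_text):
    """Load rows from CSV text into the hypothetical holdings table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(row["ticker"], float(row["weight_pct"])) for row in reader]
    conn.executemany(
        "INSERT INTO holdings (ticker, weight_pct) VALUES (?, ?)", rows
    )
    return len(rows)


def export_holdings(conn):
    """Write the holdings table back out as CSV text, sorted by ticker."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ticker", "weight_pct"])
    writer.writerows(
        conn.execute("SELECT ticker, weight_pct FROM holdings ORDER BY ticker")
    )
    return out.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holdings (ticker TEXT PRIMARY KEY, weight_pct REAL)")

# Made-up incoming file from an upstream system.
incoming = "ticker,weight_pct\nBBB,40.0\nAAA,60.0\n"
count = import_holdings(conn, incoming)
exported = export_holdings(conn)
print(count)
print(exported)
```

The import and the export both run against the same table, so what goes out is always what the warehouse currently holds rather than a hand-made copy.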
Once you’ve decided to store your data and documents in a warehouse, you’ve now entered the realm of deciding who is responsible for holding the keys to the warehouse. While the asset manager does put data in the warehouse, the asset manager typically has an army of people contributing data, depending on the size of the firm. Some smaller firms may only have one person to manage the data, so that person is probably entering all the data solo. But most firms are getting data from a mass of sources. For instance, their personnel profiles are coming from their HR department; their narratives are coming from their marketing department or the portfolio management team itself; the fees and the bios are coming from the administrators; the performance, holdings, and AUM may be coming from custodians or a plethora of other places that the firm is getting its data from. So even though the asset manager is the manufacturer of data, there are several players, either outsourced or in-house, contributing to the 15+ datasets that every investment profile is composed of.
A complete investment profile is the data that makes up a specific firm, product, and vehicle, yet that data comes from multiple people and sources. You may have a performance person, a holdings person, an AUM person, and/or a narratives person… Now, who in the firm is holding the keys and contributing data into the warehouse? We would venture to say boldly, everyone. Everyone in the firm should have some level of access to the warehouse, even if they are on a view-only basis or limited in what they can view (e.g., private company data may have C-suite access only). The entire firm has reason to assemble data in this warehouse, view data in this warehouse, and export data from this warehouse. And the data is quite active; data never sits idle. Data from 10 years ago may seem idle, but it’s being used today as well. So, if data is actively being contributed, assembled, managed, distributed, published, and used, you will probably find every person in the firm having to touch some form of data for the firm. Thus, a data warehouse should have the security, permissions, and activity tracking capabilities to ensure every person who sets foot over the threshold and into that data warehouse can perform as much or as little activity as their role necessitates.
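Role-based permissions of this kind can be sketched in a few lines. The role names, data categories, and actions below are hypothetical examples, not a prescribed scheme; the point is only that each role is granted as much or as little as it needs and everything else is denied by default.

```python
# Hypothetical roles, data categories, and actions, for illustration only.
PERMISSIONS = {
    "c_suite": {
        "private": {"view", "edit", "export"},
        "public": {"view", "edit", "export"},
    },
    "marketing": {
        "public": {"view", "edit", "export"},
    },
    "viewer": {
        "public": {"view"},
    },
}


def is_allowed(role, action, category):
    """Return True only if the role was explicitly granted the action.

    Unknown roles and ungranted categories fall through to an empty set,
    so the default answer is always "no".
    """
    return action in PERMISSIONS.get(role, {}).get(category, set())


print(is_allowed("viewer", "view", "public"))      # view-only user can look
print(is_allowed("viewer", "export", "public"))    # but cannot export
print(is_allowed("marketing", "view", "private"))  # private data stays restricted
```

Denying by default means a new hire or a mistyped role name gets no access until someone deliberately grants it.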
The greatest dangers we face in cyber security are that someone takes private data and makes it public, or that someone hacks into the data warehouse to input data not specified by the firm. In other words, is the data contained in the warehouse manufactured by the firm directly, and is that data represented as intended? There is a multitude of cyber security checks, including ensuring that logins are secure and that every activity is tracked. No action, not even logging in, goes unrecorded. There are extensive logs to monitor for any breaches and any nefarious players trying to get into the data warehouse. Data imported, data housed, and data exported should be fully controlled by cyber security to ensure that the data the manager intends is truly the data that is captured and nothing else has infiltrated those walls.
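Here is a minimal sketch of activity tracking where every action, including each login attempt, is written to a log that can then be queried for suspicious patterns. The `activity_log` table, the function names, the usernames, and the three-failure threshold are all assumptions for illustration; a production system would add far more, such as tamper protection and alerting.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_log ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "at TEXT NOT NULL, username TEXT NOT NULL, "
    "action TEXT NOT NULL, succeeded INTEGER NOT NULL)"
)


def record(conn, username, action, succeeded):
    """Append one event to the log; nothing is ever updated or deleted here."""
    conn.execute(
        "INSERT INTO activity_log (at, username, action, succeeded) "
        "VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), username, action, int(succeeded)),
    )


def repeated_failed_logins(conn, limit=3):
    """Return (username, failures) for users with at least `limit` failed logins."""
    sql = """
        SELECT username, COUNT(*)
        FROM activity_log
        WHERE action = 'login' AND succeeded = 0
        GROUP BY username
        HAVING COUNT(*) >= ?
        ORDER BY username
    """
    return conn.execute(sql, (limit,)).fetchall()


# Made-up activity: one normal session and one user failing to log in.
record(conn, "alice", "login", True)
record(conn, "alice", "export_fact_sheet", True)
for _ in range(3):
    record(conn, "mallory", "login", False)

suspects = repeated_failed_logins(conn)
print(suspects)
```

Because logins are logged the same way as imports and exports, one table answers both "who did what" and "who tried and failed to get in".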
In the end…
Data must be active and in motion, so you can use it in marketing to attract new investors and to retain existing investors, who must see the data to maintain their investment with you. You wouldn’t invest with somebody who doesn’t disclose what they’re doing with your money. Data must be housed, and then it must be distributed. Data is the holy grail of the firm. Of course, the firm’s most important asset is its human resources: the people aggregating and showcasing the data. But then, what are you going to do with the abundance of data? Where are you going to put it? If you’re going to store and distribute data, you should have a house to do that in. You can have multiple houses if you want to DIY this, or you can subscribe to a data warehouse that is turn-key and ready for you to populate with your data, secure the users of that data, and then start mapping the distribution out of that warehouse to all the places that you need to distribute to. [In case you missed it, APX Stream does all that, and does it well.]