Exchanging files between systems has been around for a while, and in my opinion it will not disappear soon. The reasons this type of data exchange is still around could be related to:
- Security rules prevent connecting directly to the source system
- The source system is outside the organization’s network, so a direct connection is not possible
- There is no compatible data connector for the source system
- etc.
CSV files do not escape the fact that our data needs are growing, and so the files are growing too. As a result, processing these files takes more and more time, and IT departments increasingly run out of their available daily batch window for loading and processing data.
Once the files are loaded, the database knows how to handle large volumes of data using parallelism and compression. Query execution times of 1 second on hundreds of millions of rows are rather common on a Fast Track Data Warehouse solution, in sharp contrast with the loading of a large CSV file, which can take several hours.
More and more customers contact us asking how to handle the problem of loading large files. This paper describes a general solution developed for a fairly typical production environment with one ETL server running Integration Services and one SQL Server Fast Track Data Warehouse server.
I have put everything in an easy-to-read document that you can download here.