Upload
enrique-catala-banuls
View
53
Download
0
Embed Size (px)
Citation preview
#TechSum
[email protected] | www.solidq.com
@enriquecatala | www.enriquecatala.comMASTER BI EMPIEZA EN 14 DIAS!http://www.solidq.com/es/masterbi
#TechSum
Microsoft SQL Server
Entorno escalable SMP y
NUMA en cualquier
hardware
OnPremise y Azure IaaS
Ideal para datamarts desde
pequeños a “medianos”
Solo software
10s de TB
Microsoft Analytics
Platform System (APS)
Appliance para entornos
datawarehouse de alto
rendimiento MPP
Solo OnPremise
Ideal para alto rendimiento
y alta escalabilidad de DW
Applicance (SW y HW)
10s de TB – 6 PB (PDW)
24TB – 1.2 PB (Hadoop)
Microsoft Azure SQL
Data Warehouse
(SQLDW)
Cloud data warehouse para
entornosd de alto
rendimiento MPP
Solo cloud
Ideal para alto
rendimiento, alta
escalabilidad en cloud
Solución completa PaaS
10s de TB – PBs
#TechSum
PaaS
Escalable
horizontal
Arquitectura
MPP
Integrado
(SSDT, TSQL)
Hibrido
(Polybase)
Coste
efectivo
#TechSum
Load rate
• Records/sec
• Network+CPU
Scan/Aggregationrate
• Records/sec leyendo
datos
• Operacion compleja
• CPU + IO
Create Table As Select
rate
• Velocidad de copia
• CPU + IO +
NETWORK
#TechSum
DWU Max concurrent
queries
Concurrency slots
allocated
DW100 4 4
DW200 8 8
DW300 12 12
DW400 16 16
DW500 20 20
DW600 24 24
DW1000 32 40
DW1200 32 48
DW1500 32 60
DW2000 32 80
DW3000 32 120
DW6000 32 240
#TechSum
BCP
SSIS
Polybase
PolyBase BCP SQLBulkCopy/ADF SSIS
Load Rate
Rate increase as you increase DWU Yes No No No
Rate increase as you add concurrent load No Yes Yes Yes
FASTEST============>>>>>>>>>>SLOWEST
Pero si que se incrementa en
Polybase cuando tenemos muchos
ficheros por carpeta
#TechSum
ANSI joins on updatesANSI joins on deletesmerge statement
cursorsSELECT..INTOINSERT..EXEC
common table expressionsrecursive common tableexpressions (CTE)
• $partition function
• table variables
• table value parameters
• distributed transactions
• commit / rollback work
• save transaction
• execution contexts (EXECUTE AS)
• group by clause with rollup / cube / grouping sets options
• nesting levels beyond 8
• updating through views
• use of select for variable assignment
• no MAX data type for dynamic SQL strings
#TechSum
• E
• T
• L
ETL SERVER
DATA
WAREHOUSE
SERVER
EXTRACT
TRANSFORM
LOAD
ELT SERVER
DATA
WAREHOUSE
SERVER
EXTRACT
TRANSFORM
LOAD
ELTETL
IN-MEMORY
TRANSFORMS
FEWER CPUS
IN-DB
MORE CPUS
GOOD SORT
#TechSum
DBSource1DBSource1
DBSource2DBSource2
DBSource3DBSource3
SQLDW (PaaS)
5
23
Report2
24
3
7
1
Report1
Powerpivot
SSIS Incremental export
Onpremise CPD1 CLOUD (PaaS)
Storage (Azure)
POLYBASE
DBSource1
DBSource2
DBSource3
DB STG BI Database CUBEPreCleanData
DWHConsolidation
SSIS SSIS
SSAS processing
SSAS
Onpremise CPD1 OnPremise CPD2
Generadas por paquetes CDC
#TechSum
Pruebas y
mediciones
• Empezar en
400DTW
• Probar con
1000DTW
Cambio
ETL a ELT
• Fuera SSIS
• PowerBI/Reporting
Patron de
subida
• Marcas de cambios
• OnPrem a SQLDW
Carga de
datos
• Ajustes del modelo
#TechSum
Exportar datos a
CSV
• SSIS
• Automatizacion
• BIML
Subir CSVs a Azure
Storage
• AzCopy
• PowerShell
Cargar CSVs en
SQLDW
• Create Table
from Select As…
• Automatizacion
• t4dacfx2tsql
#TechSum
Bytes_per_row: 250#_rows: 60,000,000,000B_to_GB_factor: 1,000,000,000Distribution_count: 60Compression_factor: 5 (typical)Partition_count: 36 (monthly over three years)Max_rows_per_rowgroup
rows_per_distribution = #_rows / distribution_countrows_per_partition = rows_per_distribution / partition_countmax_rowgroups_per_partition = partition_count / max_rows_per_rowgroup
Uncompressed_table_size_b = bytes_per_row x # rowsUncompressed_table_size_gb = Uncompressed_table_size_b / B_to_GB_factorUncompressed_table_distribution_size_gb = uncompressed_table_size_gb/distribution_countCompressed_distribution_size_gb = Uncompressed_table_distribution_size_gb /compression_factorCompressed_partition_size_gb = Compressed_distribution_size_gb / partition_count
#TechSum
Pausa tu
SQLDW
Ahorrarás dinero
Puedes seguir
subiendo datos
con Polybase
DWU €/Hora €/Mes laboral €/Mes completo
100 1,02 € 179,52 € 734,40 €
200 2,04 € 359,04 € 1.468,80 €
300 3,06 € 538,56 € 2.203,20 €
400 4,08 € 718,08 € 2.937,60 €
500 5,10 € 897,60 € 3.672,00 €
600 6,12 € 1.077,12 € 4.406,40 €
1000 10,20 € 1.795,20 € 7.344,00 €
1200 12,24 € 2.154,24 € 8.812,80 €
1500 15,30 € 2.692,80 € 11.016,00 €
2000 20,40 € 3.590,40 € 14.688,00 €
3000 30,60 € 5.385,60 € 22.032,00 €
6000 61,21 € 10.772,96 € 44.071,20 €
#TechSum
Diseña pensando en distribucion de datos
Crea estadísticas tras CREATE TABLE
Actualiza estadísticas tras carga de datos
Particiona tus objetos
Comienza por DTW 300
#TechSum
Dinero
• No es un problema
• Prefiero parar y arrancar
BI
• Si no tengo solución BI
• O no quiero invertir más y quiero la via del hierro
Tiempo
• Hay mucho dato y necesito saltarme transformacion
• Ocasionalmente pido información que necesito
rapido
© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a
commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS
PRESENTATION.