Monday, October 31, 2016

Hadoop Tutorial - from Yahoo Developer Network

Looking for a good intro and tutorial for Hadoop.
Look no further:  Hadoop Tutorial at Yahoo!.

Why Yahoo!  - they were effectively the first major company to embrace and extend Hadoop, helping to make it into what it is today.

Overview of R versions: CRAN, MRO, and R Server

Wondering just how the different versions of R compare?  And now doubt you have many of these questions:
  1. Should I use: 
    1. pure open source community supported CRAN version, 
    2. enhanced Microsoft R Open, or 
    3. R Server
  2. Multi-threaded 
    1. which R option, if any, supports multi-threading
  3. In-memory constraint
    1. which R option, if any, supports big data, and breaks out of the in-memory road block

Frank Banin has written up an excellent summary that compares the various R versions currently available. You can find it on SQLServerCentral:  Advanced Analytics with R & SQL: Part I - R Distributions

Thursday, October 27, 2016

SSRS support for Power BI - in technical Preview!

If you love Power BI, but want a localized server option using SSRS, well its coming.

Announced on October 25, 2016 on the Power BI blog, they are releasing it into technical preview.   For this first review, you'll need an Azure account. No mention of when a local, on-premises preview will be available, but still this is great news.

See:  Technical Preview of Power BI reports in SQL Server Reporting Services

Video  - here's a video showing Power BI against SSRS 2016

Create a modern enterprise reporting and mobile BI solution with SQL Server 2016


Can I try it out on-premises?  


Ok, what you really want is to run a localized VM version.  Christopher Finlan has added a posting that outlines how you can download the VM from Azure and get it up and running as a localized VM. Find that post here:  How to run the Technical Preview of Power BI Reports in SQL Server Reporting Services on-prem using Hyper-V

Azure - Business Continuity with Geo-Replication

One great reason to move to Azure is the ease with which you can setup and manage disaster recovery (DR), high availability (HA) and your overall business continuity plan (BCP). And in April of 2016, Microsoft announced support for geo-replication for all service tiers.

Great, so where to get started?  SimpleTalk.com has posted a great beginner article that discusses Geo-Replication. You can think of Geo-Replication as either log-shipping or mirroring for the Azure SQL Database environment, but either way, one benefit is that the passive secondary databases have three distinct benefits:

  1. Readability!  the secondaries are readable.
  2. Multiple passive secondaries
  3. A possible lower cost that the primary
You can find Simple-Talk's report at SQL Database: How to Configure Active Geo-Replication.



Tuesday, October 25, 2016

Master Data Services - an overview of the mechanics

Looking for an overview of the mechanics of Master Data Services.  One that covers the internals, and other mechanical basics? Then check out this posting on Simple-Talk: Master Data Services – The Basics.


Warning for Custom Visualizations in Power BI can now be suppressed!

Kryptonite No More!

Early adopters of Power BI have long had to endure the pain of seeing a warning message whenever a user was reviewing a dashboard or report with a custom visual. Well, no more. Here is the official posting by Will Thompson (Product Manager, Power BI):


Hi everyone. We’ve added an option to turn off the warning for custom visuals. You’ll find it under the Security tab of the Options dialog. You can also remove unwanted visuals from your report by clicking the in the Visualizations pane and selecting ‘Delete a Custom Visual’.
This is great news.  Having a warning pop-up as one of your executives is looking over a Power BI dashboard is not what most of us want.  You can read more about it here:

Friday, October 21, 2016

Data Center Comics, by Diane Alber

Looking for just that right data center comic? We'll look no further.
You can find some great comics at the site http://www.kipandgary.com/.



Thursday, October 13, 2016

Analytic database comparison

We found an in-depth article that compares several of the leading analytic databases. So if you are in the market for an analytic database, this might be a good place to do some comparison shopping. We have personally taken a look at Exasol and found that it performs well. Find our postings about Exasol here:  Exasol postings at Realized Design.

The database engines compare include:





Below are the summary results from the web site. If you want the full details, you can find them here: [The site is in Russian, so you'll want to enable translation.]
Development → Comparison of analytical in-memory database.


Results reported are in seconds - lower is better:

Inquiry
Greenplum
Exasol
Clickhouse
Memsql
SAP Hana
Impala
N1
14
<1
-
108
6
78
N2
131
eleven
-
 -
127
Error
N3
67
85
 -
122
733
T1
14
1.8
64
70
20
100
T2
17
4.2
86
105
20
127
D1
1393
284 
-
45
1500
-
D2
 > 7200
1200
-
 > 7200
Error 


Clickhouse Benchmarks

On the clickhouse.yandex site, they list several benchmark tests comparing Clickhouse to several other analytic databases.  But - none of the queries JOIN another table.  So sadly, not a very useful comparison.  See:  https://clickhouse.yandex/benchmark.html

MemSQL vs. MySQL (w/ InnoDB)

Good read with thoughtful comments on comparing the two: https://dom.as/2012/06/26/memsql-rage/


EXASOL release version 6 and a Free Small Business Edition

This week, EXASOL released version 6 of their analytic database. We have had good experience with version 5, so we are expecting version 6 to continue to excel. While the latest version of EXASOL has numerous enhancements, the two we especially found interesting was the addition of a data virtualization framework and the rebranding of the Community Edition to the Free Small Business Edition.

Data Virtualization

Analytic databases like EXASOL serve as a double edge sword for massive amounts of data. Their scalability provide the ability to store large repositories of data -yet, for many analytic tasks, users require much smaller, more targeted data sets. The addition of the data virtualization framework just might be what is needed to segment data for targeted end users.

Free Small Business Edition

We suspect that is simply a legal rephrasing to allow organizations to feel they have the legal ability to run production systems using their Community Edition. To that end, we are pleased that this was done.

Other Items

  • Increased support for Hadoop
  • Improved connectivity
  • Expanded support for programming languages
  • Increased performance and scalability

You can read more about it here:

LinkedIn - Analyzing data fast - Sean Jackson
EXASOL Free Trail

Wednesday, October 5, 2016

Best Practice for PolyBase Table Location - Use a Folder

Use Folders!  For production, create a new folder for each file type.  Then point your LOCATION= parameter to just the folder, and not the specific file.

Why?  


1) Add more files to the directory, and Polybase External table will automagically read them.
2) Do INSERTS and UPDATES from PolyBase back to your files in Hadoop.
    ( See PolyBase - Insert data into a Hadoop Hue Directory ,
             PolyBase - Insert data into new Hadoop Directory    ).
3) It's cleaner.


Here is a typical data folder in Hortonworks:





And here is the corresponding Create External Table:

       
  CREATE EXTERNAL TABLE [dbo].AWDW_CSV_Sales_Date (
     ModelName  nvarchar(200) NULL ,
     ShipDate  datetime      NULL ,
     ExtendedAmt  money         NULL ,
     OrderQty  smallint   NULL 
 )
 WITH (LOCATION='/user/hue/AWDW2012_SalesData',
    DATA_SOURCE = hdp23 ,
    FILE_FORMAT = CSVfile  
    );       
 


Tuesday, October 4, 2016

SQL Server 2016 Version Differences

Updated Nov 16, 2016:

With the release of SP 1 for 2016, all things have changed, and in a good way!


  1. Many features only on Enterprise are now on all, or almost all editions
  2. they did a better job formatting the page, so go to:

    https://msdn.microsoft.com/en-us/library/cc645993.aspx

    to get all the updated features in Ent, Std, Web and Express.





Technet posted a detailed comparison of the different SQL Server 2016 versions (see below), dated May 26, 2016.  Alas, the web page was not well formatted. So, below is the same data (again as of May 26, 2016) but with improved formatting.

Source:  https://technet.microsoft.com/en-us/windows/cc645993(v=sql.90)
Naturally, reference Microsoft for any changes, adjustments or inaccuracies.


Cross-Box Scale Limits
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
Maximum compute capacity used by a single instance - SQL Server Database Engine1
Operating system maximum
Limited to lesser of 4 sockets or 24 cores
Limited to lesser of 4 sockets or 16 cores
Limited to lesser of 1 socket or 4 cores
Limited to lesser of 1 socket or 4 cores
Limited to lesser of 1 socket or 4 cores
Maximum compute capacity used by a single instance - Analysis Services or Reporting Services
Operating system maximum
Limited to lesser of 4 sockets or 24 cores
Limited to lesser of 4 sockets or 16 cores
Limited to lesser of 1 socket or 4 cores
Limited to lesser of 1 socket or 4 cores
Limited to lesser of 1 socket or 4 cores
Maximum memory utilized per instance of SQL Server Database Engine
Operating System Maximum
128 GB
64 GB
1 GB
1 GB
1 GB
Maximum memory utilized per instance of Analysis Services
Operating System Maximum
Tabular: 16 GB
N/A
N/A
N/A
N/A

MOLAP: 64 GB
Maximum memory utilized per instance of Reporting Services
Operating System Maximum
64 GB
64 GB
4 GB
N/A
N/A
Maximum relational database size
524 PB
524 PB
524 PB
10 GB
10 GB
10 GB
1. Enterprise Edition with Server + Client Access License (CAL) based licensing
 (not available for new agreements) is limited to a maximum of 20 cores per SQL Server instance.
There are no limits under the Core-based Server Licensing model. For more information,
see Compute Capacity Limits by Edition of SQL Server.
RDBMS High Availability
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
Server core support 1
Yes
Yes
Yes
Yes
Yes
Yes
Log shipping
Yes
Yes
Yes
No
No
No
Database mirroring
Yes
Yes
Witness only
Witness only
Witness only
Witness only

Full safety only
Backup compression
Yes
Yes




Database snapshot
Yes





Always On failover cluster instances
Yes
Yes






Number of nodes is the operating system maximum
Support for 2 nodes
Always On availability groups
Yes






Up to 8 secondary replicas, including 2 synchronous secondary replicas
Basic availability groups

Yes





Support for 2 nodes
Connection director
Yes





Online page and file restore
Yes





Online indexing
Yes





Online schema change
Yes





Fast recovery
Yes





Mirrored backups
Yes





Hot add memory and CPU
Yes





Database recovery advisor
Yes
Yes
Yes
Yes
Yes
Yes
Encrypted backup
Yes
Yes




Smart backup
Yes
Yes




1 For more information on installing SQL Server 2016 on Server Core, see Install SQL Server 2016 on Server Core.
RDBMS Scalability and Performance
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
Operational analytics
Yes





In-Memory Column Store
Yes





Stretch Database
Yes
Yes
Yes
Yes
Yes
Yes
Persistent Main Memory
Yes
Yes
Yes
Yes
Yes
Yes
Multi-instance support
50
50
50
50
50
50
Table and index partitioning
Yes





Data compression
Yes





Resource Governor
Yes





Partition Table Parallelism
Yes





Multiple Filestream containers
Yes





NUMA Aware and Large Page Memory and Buffer Array Allocation
Yes





Buffer Pool Extension
Yes
Yes




IO Resource Governance
Yes





In Memory OLTP
Yes





Delayed Durability
Yes
Yes
Yes
Yes
Yes
Yes
RDBMS Security
Feature
Enterprise
Standard
Web
Express
Express with Advanced Services
Express with Tools
Row-level security
Yes
Yes
No



Always Encrypted
Yes





Dynamic data masking
Yes
Yes




Basic auditing
Yes
Yes
Yes
Yes
Yes
Yes
Fine grained auditing
Yes





Transparent database encryption
Yes





Extensible key management
Yes





User-defined roles
Yes
Yes
Yes
Yes
Yes
Yes
Contained databases
Yes
Yes
Yes
Yes
Yes
Yes
Encryption for backups
Yes
Yes




Replication
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
Heterogeneous subscribers
Yes
Yes




Merge replication
Yes
Yes
Yes (Subscriber only)
Yes (Subscriber only)
Yes (Subscriber only)
Yes (Subscriber only)
Oracle publishing
Yes





Peer to peer transactional replication
Yes





Snapshot replication
Yes
Yes
Yes (Subscriber only)
Yes (Subscriber only)
Yes (Subscriber only)
Yes (Subscriber only)
SQL Server change tracking
Yes
Yes
Yes
Yes
Yes
Yes
Transactional replication
Yes
Yes
Yes (Subscriber only)
Yes (Subscriber only)
Yes (Subscriber only)
Yes (Subscriber only)
Transactional replication to Azure
Yes
Yes
Yes



Transactional replication updateable subscription
Yes





Management Tools
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
SQL Management Objects (SMO)
Yes
Yes
Yes
Yes
Yes
Yes
SQL Configuration Manager
Yes
Yes
Yes
Yes
Yes
Yes
SQL CMD (Command Prompt tool)
Yes
Yes
Yes
Yes
Yes
Yes
SQL Server Management Studio
Yes
Yes
Yes
Yes
Yes

Distributed Replay - Admin Tool
Yes
Yes
Yes
Yes
Yes

Distribute Replay - Client
Yes
Yes
Yes



Distributed Replay - Controller
Yes (Up to 16 clients)
Yes (1 client)
Yes (1 client)



SQL Profiler
Yes
Yes
No 1
No 1
No 1
No 1
SQL Server Agent
Yes
Yes
Yes



Microsoft System Center Operations Manager Management Pack
Yes
Yes
Yes



Database Tuning Advisor (DTA)
Yes
Yes 2
Yes 2



Deploy a SQL Server Database to a Windows Azure VM Wizard
Yes
Yes
Yes
Yes
Yes
Yes
SQL Server Data Files in Windows Azure
Yes
Yes
Yes
Yes
Yes
Yes







1 SQL Server Web, SQL Server Express, SQL Server Express with Tools, and SQL Server Express with Advanced Services can be profiled using SQL Server Standard and SQL Server Enterprise editions.













2 Tuning enabled only on Standard edition features






RDBMS Manageability
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
User instances



Yes
Yes
Yes
LocalDB



Yes
Yes

Dedicated admin connection
Yes
Yes
Yes
Yes with trace flag
Yes with trace flag
Yes with trace flag
PowerShell scripting support
Yes
Yes
Yes
Yes
Yes
Yes
SysPrep support 1
Yes
Yes
Yes
Yes
Yes
Yes
Support for data-tier application component operations - extract, deploy, upgrade, delete
Yes
Yes
Yes
Yes
Yes
Yes
Policy automation (check on schedule and change)
Yes
Yes
Yes



Performance data collector
Yes
Yes
Yes



Able to enroll as a managed instance in multi-instance management
Yes
Yes
Yes



Standard performance reports
Yes
Yes
Yes



Plan guides and plan freezing for plan guides
Yes
Yes
Yes



Direct query of indexed views (using NOEXPAND hint)
Yes
Yes
Yes



Automatic indexed views maintenance
Yes
Yes
Yes



Distributed partitioned views
Yes
Partial. Distributed partitioned views are not updatable
Partial. Distributed partitioned views are not updatable
Partial. Distributed partitioned views are not updatable
Partial. Distributed partitioned views are not updatable
Partial. Distributed partitioned views are not updatable
Parallel indexed operations
Yes





Automatic use of indexed view by query optimizer
Yes





Parallel consistency check
Yes





SQL Server Utility Control Point
Yes





Contained databases
Yes
Yes
Yes
Yes
Yes
Yes
Buffer pool extension
Yes
Yes




1 For more information, see Considerations for Installing SQL Server Using SysPrep.
Development Tools
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
Microsoft Visual Studio integration
Yes
Yes
Yes
Yes
Yes
Yes
Intellisense (Transact-SQL and MDX)
Yes
Yes
Yes
Yes
Yes
Yes
SQL Server Data Tools (SSDT)
Yes
Yes
Yes
Yes


SQL query edit and design tools
Yes
Yes




Version control support
Yes
Yes




MDX edit, debug, and design tools
Yes
Yes




Programmability
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
Basic R integration
Yes
Yes
Yes
Yes


Advanced R integration
Yes





R Server (Standalone)
Yes





Polybase compute node
Yes
Yes




Polybase head node
Yes





JSON
Yes
Yes
Yes
Yes
Yes
Yes
Query Store
Yes
Yes
Yes
Yes
Yes
Yes
Temporal
Yes
Yes
Yes
Yes
Yes
Yes
Common Language Runtime (CLR) Integration
Yes
Yes
Yes
Yes
Yes
Yes
Native XML support
Yes
Yes
Yes
Yes
Yes
Yes
XML indexing
Yes
Yes
Yes
Yes
Yes
Yes
MERGE & UPSERT capabilities
Yes
Yes
Yes
Yes
Yes
Yes
FILESTREAM support
Yes
Yes
Yes
Yes
Yes
Yes
FileTable
Yes
Yes
Yes
Yes
Yes
Yes
Date and Time datatypes
Yes
Yes
Yes
Yes
Yes
Yes
Internationalization support
Yes
Yes
Yes
Yes
Yes
Yes
Full-text and semantic search
Yes
Yes
Yes
Yes


Specification of language in query
Yes
Yes
Yes
Yes


Service Broker (messaging)
Yes
Yes
No (Client only)
No (Client only)
No (Client only)
No (Client only)
Transact-SQL endpoints
Yes
Yes
Yes



Data Warehouse
Feature
Enterprise
Standard
Web
Express with Advanced Services
Express with Tools
Express
Create cubes without a database
Yes
Yes




Auto-generate staging and data warehouse schema
Yes
Yes




Change data capture
Yes





Star join query optimizations
Yes





Scalable read-only Analysis Services configuration
Yes





Parallel query processing on partitioned tables and indexes
Yes





xVelocity memory optimized columnstore indexes
Yes





Global batch aggregation
Yes