Friday, December 23, 2016

A simple overview - random forests, decision trees, regression, oh my!

Looking for an easy summary overview for decision trees, regressions, support vectors, and bayes? Well, we found on at iocentral.io. You can find it at:  Want to know how to choose Machine Learning algorithm?  We've replicated the core parts below:


Decision Trees: Decision tree output is very easy to understand even for people from non-analytical background. It does not require any statistical knowledge to read and interpret them. Fastest way to identify most significant variables and relation between two or more variables. Decision Trees are excellent tools for helping you to choose between several courses of action. Most popular decision trees are CART, CHAID, and C4.5 etc.
In general, decision trees can be used in real-world applications such as:
·        Investment decisions
·        Customer churn
·        Banks loan defaulters
·        Build vs Buy decisions
·        Company mergers decisions
·        Sales lead qualifications
 
Logistic Regression: Logistic regression is a powerful statistical way of modeling a binomial outcome with one or more explanatory variables. It measures the relationship between the categorical dependent variable and one or more independent variables by estimating probabilities using a logistic function, which is the cumulative logistic distribution.
In general, regressions can be used in real-world applications such as:
·        Predicting the Customer Churn
·        Credit Scoring & Fraud Detection
·        Measuring the effectiveness of marketing campaigns
 
Support Vector Machines: Support Vector Machine (SVM) is a supervised machine learning technique that is widely used in pattern recognition and classification problems - when your data has exactly two classes.
In general, SVM can be used in real-world applications such as:
·        detecting persons with common diseases such as diabetes
·        hand-written character recognition
·        text categorization – news articles by topics
·        stock market price prediction
 
Naive Bayes: It is a classification technique based on Bayes’ theorem and very easy to build and particularly useful for very large data sets. Along with simplicity, Naive Bayes is known to outperform even highly sophisticated classification methods. Naive Bayes is also a good choice when CPU and memory resources are a limiting factor
In general, Naive Bayes can be used in real-world applications such as:
·        Sentiment analysis and text classification
·        Recommendation systems like Netflix, Amazon
·        To mark an email as spam or not spam
·        Facebook like face recognition
 
Apriori: This algorithm generates association rules from a given data set. Association rule implies that if an item A occurs, then item B also occurs with a certain probability.
In general, Apriori can be used in real-world applications such as:
·        Market basket analysis like amazon - products purchased together
·        Auto complete functionality like Google to provide words which come together
·        Identify Drugs and their effects on patients
 
Random Forest: is an ensemble of decision trees. It can solve both regression and classification problems with large data sets. It also helps identify most significant variables from thousands of input variables.
In general, Random Forest can be used in real-world applications such as:
·        Predict patients for high risks
·        Predict parts failures in manufacturing
·        Predict loan defaulters
The most powerful form of machine learning being used today, is called “Deep Learning”.
In today’s Digital Transformation age, most businesses will tap into machine learning algorithms for their operational and customer-facing functions.
Source: http://www.iotcentral.io/blog/want-to-know-how-to-choose-machine-learning-algorithm

Tuesday, December 20, 2016

What's new in SQL Server 2016

James Serra posted a SlideShare covering what's new in SQL Server 2016, including the recently released SP1.  SP1, BTW, opened up just about all the features to all versions of SQL Server.

If you like slide shows that outline the latest with SQL Server, take a look at the SlideShare on LinkedIn.  You can find it here:

What's new in SQL Server 2016 - James Serra

Monday, December 19, 2016

Index usage stats for SQL Server

Need information on if your indexes are being used?  Greg Larsen has posted a query utilizing the DMV sys.dm_db_index_usage_stats.  You can find it here:  Which Indexes are not Used? - Greg Larsen

SQL Server v.Next CTP 1.1 released

Microsoft released the first update to the Linux v.Next version of SQL Server, officially lised as SQL Server v.Next Community Technology Preview (CTP) 1.1. From the announcement, the key database engine enhancements include:


Key SQL Server v.Next CTP 1.1 database engine enhancements include:

  • Language and performance enhancements to natively compiled T-SQL modules, including support for OPENJSON, FOR JSON, JSON built ins as well as memory-optimized tables support for computed columns.
  • Improved the performance of updates to non-clustered columnstore indexes in the case when the row is in the delta store. 
  • Batch mode queries now support "memory grant feedback loops,” which learn from memory used during query execution and adjusts on subsequent query executions; this can allow more queries to run on systems that are otherwise blocking on memory.
  • New T-SQL language features:
  • Introducing three new string functions: TRIM, CONCAT_WS, and TRANSLATE
  • BULK IMPORT supports CSV format and Azure Blob Storage as file source
  • STRING_AGG supports WITHIN GROUP (ORDER BY)s
  • Most of the improvements in SQL Server 2016 Service Pack 1 are now available in SQL Server on Linux, with the exception of a few features such as Polybase 
Source:  Microsoft community update email: Dec 19, 2016

How to Update -  See:    Update SQL Server on RHEL - CentOS

Update SQL Server on RHEL - CentOS

Sourced from the Microsoft support documents:

Notes:  


1) No restart/reboot after running the update command
2) Microsoft's posted command uses yum update, not yum upgrade.


    "In order to upgrade the mssql-server package, execute the following command:

   sudo yum update mssql-server

These commands will download the newest package and replace the binaries located under /opt/mssql/. The user generated databases and system databases will not be affected by this operation."


Source:

Monday, December 12, 2016

SSMS vNext and "Cannot connect to WMI provider"

Problem:  


After installing SQL Server Management Studio for vNext, the Configuration Manager no longer opens, with a message similar to the following:


             Cannot connect to WMI provider. You do not have permission or the server is                        unreachable. Note that you can only manage SQL Server 2005 and later servers with SQL Server Configuration Manager.
Invalid namespace [0x8004100e]



Steps to correct:



  1. Search for the file:  sqlmgmproviderxpsp2up.mof
    You will probably find it in the folder:

       C:\Program Files (x86)\Microsoft SQL Server\140\Shared

  1. Run the following command, adjusted using the file location from the search above.

    C:\WINDOWS\system32>mofcomp "%programfiles(x86)%\Microsoft SQL Server\140\Shared\sqlmgmproviderxpsp2up.mof"



Based on Microsoft support - kb966013:




Linux - SQL Server Service: Status - Stop - Start - Restart Commands

A listing of commands to check the status of the SQL Server service, along with stopping, starting and restarting the service (RHEL):


Status Check

                 systemctl status mssql-server -l

Start Service


                 sudo systemctl start mssql-server

Re-start Service


                 sudo systemctl restart mssql-server

Stop Service

                 sudo systemctl stop mssql-server



Thursday, December 8, 2016

Database Uptime vs. Aircraft Engine Maintenance

Brent Ozar has a great way to compare maintaining databases to maintaining aircraft engines. It goes like this:


"...if you have an airplane, it’s absolutely imperative that its engines not fail mid-flight. In order to accomplish that, you have to have regular downtime for mechanics to examine and replace parts – and that doesn’t happen up in the air. 

With Availability Groups, we’re lucky enough to be able to transfer our passengers databases from one airplane to another quickly – but we still have to have those other airplanes getting constant examinations and patches from mechanics." (Brent Ozar, December 7, 2016)


You can read the full posting at:  Availability Groups: More Planned Downtime for Less Unplanned Downtime

Tuesday, November 29, 2016

R - Powered Custom Visuals in Power BI

R - Powered custom visuals are now available on the Power BI  site. The custom visuals include those for correlation plots, decision trees, forecasting, clustering, association and spline.  You'll need to install R on your workstation (desktop). The best part is that users will not need to know R. The custom visuals do the work interacting with R.  You can find out more here:  Shape Power BI R visuals, without understanding R, and download the custom visuals here:  Power BI Custom Visuals.



Power BI - November 2016 update

Microsoft posted an announcement of its latest Power BI updates on YouTube (November 28, 2016). The twenty minute (20:44) presentation outlines improvements in four areas: Report View; Analytics; Data Connectors; Query Editing Improvements


Monday, November 28, 2016

SSMS & Linux - Use the Wizard to RESTORE Database

You can use SQL Server Management Studio (SSMS) to RESTORE a backup file to your Linux based SQL Server installation.  While the most control comes from T-SQL commands, we want to know just how far we can use the SSMS GUI and its related wizards.

Once connected, our Locate Backup File pop-up takes us to the C:\var\opt\mssql\data folder. Interestingly, on our Linux machine, the folder is simply addressed as:  /var/opt/mssql/data. No leading C:\, and the slashes are reversed.  The wizard pop-up looks like this:





For this preview release, the location wizard limits our access to various files and folders. It's not good form, but we'll move the file to the /data/ folder.


Copy / Move



Here we move our backup file to the folder where SSMS is expecting it to be on our Linux machine. For production, you really will not do this, but just so we can see our backup file display in our wizard, we'll place a copy of the backup file in the default data folder used by SSMS.

Using a terminal, use the copy command cp, to move the file:


But my Backup file is still not visible in the wizard!


Permissions.  If you drill down into the folders in Linux, we found that the files already present in the /data/ folder are owned by the user mssql.  Our recently copied backup file is NOT owned by mssql, and it not accessible to other users. So, our wizard cannot see the file. Options:

  1. Permissions.  Once we changed access permissions on our backup file, our wizard could access the backup file.
  2. Don't use the folder: /var/opt/mssql/data. This folder holds the live database files (.mdf, .ldf, etc) so it's best to create a new folder to hold your backup files. The location wizard will let you type in a different folder directory.




Change Permissions on Test.bak


If you need to change permissions on the file, you most likely will have to start the file viewer "Nautilus" as a root user.  In a terminal, 

  1. sudo su ( and enter root password)
  2. nautilus

This brings up the file viewer as root.  Navigate to /var/opt/mssql/data, and then adjust the file permissions:



Once done, you should be able to see your backup file in the Location wizard.







Tuesday, November 22, 2016

Permissions blocking SQL Server on Linux to open file

You're attempting to RESTORE a SQL Server database backup to your Linux installation and you get the message:

Msg 3201, Level 16, State 2, Line 17
Cannot open backup device 'C:\home\user\Downloads\AdventureWorks2012.bak'.
Operating system error 2(The system cannot find the file
specified.).
Msg 3013, Level 16, State 1, Line 17
RESTORE HEADERONLY is terminating abnormally.


Check Permissions


Chances are, file and folder permissons are blocking access to the file. 

In our example, we placed the backup file in our Download folder. Regardless of the folder used, you'll need to make sure SQL Server can access the file, and that includes access to the folder. In both our CentOS 7 (1511) and Ubuntu setups, we needed to open up access starting at the top folder - Home, open up access to the Downloads folder, and finally to the backup file itself.

  • Home folder           - Allow Others to Access files
  • Downloads folder  - Allow Others to Access files
  • backup file             - Allow Others to Access files
Once you have made these changes, SQL Server should be able to access the file successfully. Below is a sample where we changed permissions on the file. Folder permissions follow a similar approach.




Naturally, for production, you'll want to use a more sophisticated approach, and probably not use the Downloads folder of a user.  Something more along the lines of:

  $ sudo chown mssql /home/user/Downloads/AdventureWorks2012.bak

  $ sudo chgrp mssql /home/user/Downloads/AdventureWorks2012.bak



Also see how to RESTORE to SQL Server on Linux:  Restore a SQL Server database to Linux.


CentOS pre-built VirtualBox VM

We found a recent site with several pre-built CentOS virtual images. The site has both VirtualBox and VMware images, for not only CentOS, but also many other distro's.

You can find the CentOS images here:  https://www.osboxes.org/centos/

You can also find many at Virtualbox:  https://virtualboxes.org/images/centos/

CentOS 7 - install VirtualBox Guest Additions

With a fresh CentOS VM up and running in VirtualBox, our next step was to add in GuestAdditions. But, it failed.  Unlikc the install on Ubuntu, CentOS kicked back with this message:


       

The gcc utility was not found. If the following module compilation fails
then this could be the reason and you should try installing it.
The headers for the current running kernel were not found. If the following module compilation fails then this could be the reason. The missing package can be probably installed with yum install kernel-devel-3.10.0-327.el7.x86_64
Building the main Guest Additions module[FAILED] (Look at /var/log/vboxadd-install.log to find out what went wrong)Doing non-kernel setup of the Guest Additions[  OK  ]Press Return to close this window...^CSignal caught, cleaning upPress Return to close this window...

Fortunately, the answer has already been posted.
You can find it here:  How To Install VirtualBox 5 Additions on CentOS 7.

In summary, the steps were:
  1. $ sudo yum -y update
  2.    reboot
  3. $ sudo yum install epel-release
  4. $ sudo yum install dkms kernel-devel
  5.    install GuestAdditions from CD
    1. via auto-installer, or
    2. terminal
      1. $ cd /run/media/<yourid>/VBOXADDITIONS_x_x_x_xxxxx
      2. $ sudo ./VBoxLinuxAdditions.run
  6.   reboot


CentOS - settings to Connect automatically

Curiously, with a fresh install of CentOS (we're using v7 (1511) the network configuration does not automatically connect at startup.  We found two ways to change this. One, via the GNOME GUI interface, the other using a terminal.

Terminal


On the TechnologyTales.com website, we found a terminal setting adjustment. You can read about it here:  Automatically enabling your network connection at startup on CentOS 7.


Desktop Interface


While ideal for production, GUI's are great for one-off adjustments that don't require the user to know explicitly what command and its flags should be used. So for those users, here is an alternative.

Using the toolbar, top right, click on the down arrow, and then select the Wired section:


Next, select the Wired Settings section:


In the bottom right, click on the system/properties icon:


Finally, in the properties box, first, select the "Identity" menu item, and on this page, check the "connect automatically" box.









Monday, November 21, 2016

Restore a SQL Server database to Linux

For this posting, we have an on-premises virtual machine with SQL Server vNext on Ubuntu. SQL Server vNext is up and running, and we're using the SQL Server Management Studio for vNext to both backup our source database and then restore the database to Ubuntu.  Since all of the tasks were performed via T-SQL scripts, they could just as easily been run from SQurrill SQL, sqlcmd, or any other tool.
(  see Use SSMS to connect to Linux SQL Server v.Next
   and Use SQuirrel SQL to Connect to Linux SQL Server v.Next  )

Overview


Overall, the process was straight forward, especially since we were using SSMS. File and folder permissions was the one area that caused us the most problems.

First, back up the database


We used an older version of AdventureWorks for this task.  We explicitly used the NO_COMPRESSION option for this demo - to keep it simple.  

       
    BACKUP DATABASE [AdventureWorks2012] 
       TO  DISK = N'H:\Data\AdventureWorks2012.bak' 
      WITH NOFORMAT, NOINIT,  NAME = N'AdventureWorks2012-Full Database Backup'
        , SKIP, NOREWIND, NOUNLOAD, NO_COMPRESSION,  STATS = 10, CHECKSUM
    GO   
 

Next, change Permissions to Read-only


Copy/move the backup file to your Ubuntu virtual machine. To keep permissions simple, just copy the file to the Downloads folder on Ubuntu. Once the file has been moved, adjust the file permissions so that the file can be read.  For demo purposes, the quickest approach is to change the "Others" section and change Access to "Read-only".



File and folder permissions can fail the restore, so if you use another folder, just make sure all folder and file permissions allow access.

Test the file HEADER and FILELIST


To verify we could read the backup file, we ran both RESTORE HEADER and RESTORE FILELISTONLY.


       
      USE [master]
      RESTORE HEADERONLY
        FROM  DISK =  N'C:\home\user\Downloads\AdventureWorks2012.bak' 
 
      USE [master]
      RESTORE FILELISTONLY
       FROM  DISK = N'C:\home\user\Downloads\AdventureWorks2012.bak'

       
 

On one of our earlier failed restore attempts, it was clear there was a problem with the backup file. From further examination of the backup, it was clear the backup file had issues:





If you are not sure of the actual Ubuntu file location, use properties:



Swap backslash/forward slash


Notice the different slash orientation: 

Ubuntu:                 /home/user/Downloads/AdventureWorks2012.bak
SSMS:               C:\home\user\Downloads\AdventureWorks2012.bak



Restore Database to Linux


To control the process, we used the MOVE option, and reviewed all target folder permissions to ensure that our restore would be successful.

       
 USE [master]
 RESTORE DATABASE AdventureWorks2012 
  FROM  DISK = N'C:\home\user\Downloads\AdventureWorks2012.bak'
 
  WITH  FILE = 1
, MOVE N'AdventureWorks2012_Data' TO N'C:\media\user\Data\SQLServer\AdventureWorks2012_Data.mdf'   
, MOVE N'AdventureWorks2012_Log'  TO N'C:\media\user\Data\SQLServer\AdventureWorks2012_Log.ldf'   
, NOUNLOAD,  STATS = 5

GO
       
 






Use SQuirrel SQL to Connect to Linux SQL Server v.Next

Use your existing SQuirrel SQL setup to connect to SQL Server vNext for Linux. It's a standard connection using the jTDS Microsoft SQL Driver - just like any other SQL Server connection.

-- We also used this same driver to use Oracle's SQL Developer to connect to SQL Server vNext for Linux as well.

Here's the Alias configuration used to connect to SQL Server vNext on Linux:


Friday, November 18, 2016

Use SSMS to connect to Linux SQL Server v.Next

We've downloaded the v.Next preview release of SQL Server for Linux and it's time to connect and take a look.  Great, but the online documentation is focused on command line interaction using sqlcmd.  [ for more information on getting the SQL Server for Linux preview
- see SQL Server on Linux is now in public preview)

Command lines are great for production, but for investigation, a GUI interface is better. So, can we connect using SQL Server Management Studio?  The answer is yes, but with some qualifiers.


  1. Use a current production release of SSMS?

    We used a slightly outdated SSMS, version/release 13.0.15700.28. The current release is build number : 13.0.16000.28 (lised as release 16.5).  With our older version, we were able to make a connection using a standard connection using the IP for the server name along with SQL Server Authentication. While we could make a connection, we opted to switch to the new v.Next preview edition.
  2. Use a preview edition of SSMS, updated for SQL Server v.Next, version 17.0 RC1
    You can down a preview edition on the main SSMS download page: Download SQL Server Management Studio (SSMS)

Wednesday, November 16, 2016

CREATE OR ALTER - now in SQL Server 2016, SP1

Long overdue, but now, with the release of SP1 for SQL2016, you can now use CREATE OR ALTER for:


  • stored procedures
  • triggers
  • user defined functions
  • views

SQL Server 2016 - Enterprise features on most editions - via SP1

Announced today, SQL Server 2016 SP 1 has been released, and along with the expected fixes, they have basically upgraded the Standard, Web, Express and LocalDB versions to a near Enterprise level of features. Here is a nice graphic that is posted on the blog making the announcement:

Source:  blogs.msdn.microsoft.com/


What was not upgraded?  Scale and high availability.You can read all the details
at:        SQL Server 2016 Service Pack 1 (SP1) released !!!
plus:    Features Supported by the Editions of SQL Server 2016

SQL Server on Linux is now in public preview

Today, November 16, 2016,  Microsoft announced public preview availability of SQL Server on Linux. On the site, they have listed install instructions for the following distro's:


  • Red Hat Enterprise Linux 7.2   (we tested, it works with CentOS  [1])
  • Ubuntu Linux 16.04
  • SUSE Linux Enterprise Server v12 SP2 (listed as in-process )
Plus a bit of starter code. To get started, go to:



We've started to take a look.



You can also find a post on how to use SQL Server on Linux via Docker, along with VS and your Mac.  you can find it at SentryOne here:  VS Code on Mac meets SQL Server on Linux (in Docker)


[1]  CentOS release 7 (1511) is listed as based on RHEL 7.2.
      Reference: https://wiki.centos.org/Download.  No guarantees, but we tested it, and for us it works.


Thursday, November 10, 2016

Upload Power BI Desktop .pbix to SSRS 2016

With the preview release of SSRS 2016 with support for Power BI Desktop reports, we wanted to test out just how easy - or difficult it would be to actually upload a report to SSRS 2016 with support for Power BI .pbix reports. Actually, it is pretty easy.

Overview


While the process is easy to upload a Power BI Desktop .pbix report to SSRS 2016, reports are loaded using a browser connected to SSRS. At least for the Preview, you cannot use Power BI Desktop to upload or publish reports to SSRS. Since our immediate goal is to run a .pbix report on SSRS, this not really an issue. 

         See also: Setup SSRS 2016 with support for Power BI Desktop in VirtualBox for how to
         get your own copy as a virtual machine on Azure.

With our local virtual machine up and running, we wanted to be able to create a new Power BI Desktop report on our own workstation, and then upload the report to the preview SSRS server.  To do this we had to take a few extra steps to get our VM ready for external connections.

On the SSRS Server:


  1. Enable TCP/IP:  on our preview VM, use SQL Server Configuration Manager and then enable TCP/IP.  You'll need to stop/restart services.
  2. Open ports to allow external access to SQL Server.  See:  Script to open ports for SQL Server

In SSRS  (this is optional, depending on the user to log in)


  1. SSRS Preview:  On the desktop, open the link that takes you to the SSRS portal. Naviage to the security section, and add the user you will use to connect to SSRS, and enable the appropriate roles. (or add to the server Administrators group)
    Optional:  or, you can just connect remotely using the admin/pw for the server when you set it up - your choice.




Connect via Browser

Get the reporting server URL and connect using a browser on your remote workstation. You can do this on the VM, but we wanted to test our ability to upload a .pbix remotely. And while we typically use Chrome, for this, we used Edge. It is actually very fast, faster than Chrome, but does not yet have a Norton plug-in for web security.

You will get a challenge/response:


Connect using the admin/pw credentials for the server, or others you setup earlier.  You will get a page that looks like this:

Notice the Power BI Desktop Reports section at the bottom.



Upload a new Power BI Desktop Report

We created several new reports using Power BI Desktop on our workstation that used the databases on our SSRS 2016/Power BI VM. Now, it was time to upload them to SSRS.

1.  On the menu bar, locate the Upload button and select





2. Select your .pbix file:
This brings up a pop-up file selection window. Locate and select your .pbix file, and then select/click the Open button. Your .pbix file will go directly into the "Power BI Desktop Reports" section.





That is it.  Your Power BI Desktop Report will be available for review.
[the (PBID July 2016) just tells us that we built this report with the Power BI Desktop July 2016 release - Version: 2.37.4464.461 64-bit (July 2016).




Notes, warnings and other considerations

  1. Version of Power BI Desktop
  2. Folders
  3. What's missing

Version of Power BI Desktop.

For this preview, you will need to use either the September, 2016 version, or an earlier version. The preview uses version:  2.39.4526.801 64-bit (September, 2016).  The October version did not work for us. We used a July version of Power BI Desktop for our tests.  If you use a new version, you will get this error:



An error has occurred.  Something went wrong. Please Try again later.
If you try to open the .pbix file you created in a newer version on the VM using the  2.39.4526.801 64-bit (September, 2016) installed on the VM, you will get this message:



Unable to Open Document
.pbix is incompatible with your current version of Power BI Desktop
No worries, just use an older copy of Power BI Desktop - or use the one in the VM.  Longer term, we suspect that this will be resolved.


Custom Folders

If you want your .pbix file to reside in a custom folder, no problem. First navigate to your folder, and then start the upload process.  If the folder does not yet have a section labeled  "Power BI Desktop Reports" the upload process will first create one. Below is a new folder where the upload process automagically created the  "Power BI Desktop Reports" section.




What's Missing?

It's a preview, and a great start at that. Still, there are some items that are not in the SSRS preview. Let's take a look at the one's we have identified so far in our review:


  1. Dashboards - at least for now, SSRS only supports reports
  2. Natural Language Queries
  3. Quick Insights











Setup SSRS 2016 with support for Power BI Desktop in VirtualBox

SSRS 2016 supporting Power BI Desktop reports is now in preview on Azure. But for many of us, we'd rather be able to review this in our own virtual environment, and more specifically - VirtualBox. We'll now you can. 

Our starting point was a blog posting my Microsoft employee Christopher Finlan outlining the steps needed to setup this preview in a Hyper-V environment. A great start, but what we wanted was the ability to run it Virtual Box. Fortunately for us, running the downloaded VHD in VirtualBox is much easier than Hyper-V.

Also see:  Upload Power BI Desktop .pbix to SSRS 2016

Overview of Steps


  • In Azure, create a new SSRS-PowerBI preview virtual machine
  • Download the Azure VHD
  • Create a new VirtualBox VM - without storage/virtual harddisk
  • Update the VM settings and add the downloaded VHD
  • Start the VM, log in, and add VirtualBox Guest Additions
  • Delete your Azure virtual image and both storage items
    - or pay about $5.00 a day in storage costs.  



Detailed Steps


Create a new Azure SSRS 2016 with Power BI Desktop support.


Rather than repeat what has been done - you can just follow the steps outlined on MSDN.   Keep a record of both the admin login name and the password.


Download the VHD for you newly created Azure image


Again, just follow the steps outlined by Christopher Finlan on his posting. We plan to use VirtualBox, so when following Finlan's steps, you only need to download the VHD.  After that, you can stop. Using VirtualBox, these is no need to do the other steps he has outlined to get it to work with Hyper-V.  We used Microsoft's Azure Storage Explorer to great success.


Make a VHD backup.  Personal choice, especially if you plan to delete the one on Azure.

Create a new VirtualBox machine


Here, create a new VirtualBox machine without a virtual harddisk.  The operating system is Windows Server 2012 R2.

And then update the configuration settings for your new VM

Start with these settings
  1. General - use defaults
  2. System
    1. Motherboard:  
      1. memory:  8+ GB, 
      2. floppy:     disable
      3. others:      go with defaults
    2. Processor:          4+
    3. All others:          use defaults
  3. Display                         use defaults
  4. Storage
    1. move your downloaded VHD to your newly created VM folder. Not required, but it helps to keep the folders well organized
    2. attach VBoxGuestAdditons.iso to the CD/DVD
  5. Audio - defaults  (disabled)
  6. Network - we use the bridged Adapter
  7. Serial Ports - defaults  (disabled)
  8. USB - use defaults (disabled)
  9. Shared folders - we usually hold off on this until we install Guest Additions
  10. User Interface - defaults


Start up the VM in VirtualBox

On first start-up, it takes several minutes. Login with your pre-configured admin login and password. After login, we were greeted with a Swap-file warning. Review the paging file settings, and adjust as necessary.





Install VirtualBox Guest Additions

While not necessary, adding Guest Additions allows us to use shared folders. A very nice plus.


Delete your Azure VM and both storage files

When created, the Azure VM sets up two storage files. The smaller +/- 133 GB VHD file you download. The other one is over 1 Terabyte.  Naturally, you do not need to delete them, but we found that but left on Azure - even if it is turned off, it will run you about $5.00 a day in storage costs.  


More about SSRS and Power BI Desktop reports


Upload Power BI Desktop .pbix to SSRS 2016



Friday, November 4, 2016

What tool to learn first - data science and statistics

Not sure which one to learn first? It is a fair question. And while there are many to choose from, the most commonly used are R, python and SAS. Personally, I'd suggest R first.  It is free, is easy to learn the basics, and is just a good tool to know for data science use. And we've outlined some great starting resources to get you started:  Data Science Resources at Realized Design

If you'd like to read a well written, fairly comprehensive comparison between R, python and SAS, check out this posting:  R, Python or SAS: Which one should you learn first?

Monday, October 31, 2016

Hadoop Tutorial - from Yahoo Developer Network

Looking for a good intro and tutorial for Hadoop.
Look no further:  Hadoop Tutorial at Yahoo!.

Why Yahoo!  - they were effectively the first major company to embrace and extend Hadoop, helping to make it into what it is today.