Friday, August 15, 2014

Free Convert for Excel files (.xls to .xlsx)

I was pleasantly surprised that Excel 2010 has a command line tool that will convert .xls to .xlsx files. Here is how you use it.

In this example we are converting a file called MyFile.xls in the temp directory to MyFle.xlsx in the same temp directory. I didn't have any luck getting this to work without using absolute path.

  1. Open a command prompt 
  2. Execute the following (after adjusting it to meet your specific environment:
    "C:\Program Files\Microsoft Office\Office14\excelcnv.exe" -oice "C:\temp\MyFile.xls" "C:\temp\MyFile.xlsx"

Since this is command line driven you can script many files easily.

FOR %%X in (".\*.xls") DO IF NOT %%~xX == .xlsx echo Converting "%%~dpnxX"  & "C:\Program Files\Microsoft Office\Office14\excelcnv.exe"  -nme -oice "%%~dpnxX" "%%~dpnX.xlsx" 

WARNING: For some reason Excel may tell you that it didn't launch correctly and ask you if you want to run it in safemode. I have no clue as to why this is, but clicking No each time seems to work fine. It will do this for each file.

Friday, July 18, 2014

Remove alpha characters from string using SQL

If you have a string that has both numbers and alpha characters in it and want to remove all letters A-Z then this is a simple function that you can use on a column in SQL Server.

create function RemoveAlphas(@Text as nvarchar(255))
returns nvarchar(255)
Declare @Result as nvarchar(255)
Set @Result = @Text
Set @Result = Replace(@Result, 'A', '')
Set @Result = Replace(@Result, 'B', '')
Set @Result = Replace(@Result, 'C', '')
Set @Result = Replace(@Result, 'D', '')
Set @Result = Replace(@Result, 'E', '')
Set @Result = Replace(@Result, 'F', '')
Set @Result = Replace(@Result, 'G', '')
Set @Result = Replace(@Result, 'H', '')
Set @Result = Replace(@Result, 'I', '')
Set @Result = Replace(@Result, 'J', '')
Set @Result = Replace(@Result, 'K', '')
Set @Result = Replace(@Result, 'L', '')
Set @Result = Replace(@Result, 'M', '')
Set @Result = Replace(@Result, 'N', '')
Set @Result = Replace(@Result, 'O', '')
Set @Result = Replace(@Result, 'P', '')
Set @Result = Replace(@Result, 'Q', '')
Set @Result = Replace(@Result, 'R', '')
Set @Result = Replace(@Result, 'S', '')
Set @Result = Replace(@Result, 'T', '')
Set @Result = Replace(@Result, 'U', '')
Set @Result = Replace(@Result, 'V', '')
Set @Result = Replace(@Result, 'W', '')
Set @Result = Replace(@Result, 'X', '')
Set @Result = Replace(@Result, 'Y', '')
Set @Result = Replace(@Result, 'Z', '')
return @Result


select dbo.RemoveAlphas('123abc456DEF')

Returns 123456

Wednesday, July 9, 2014

Using T-SQL to format date as yyyy-mm-dd

Surprisingly, MS SQL Server doesn't provide custom date formatting. Instead you need to use one of their existing formats or use C# to implement it, but the later seems a bit overkill for our purposes here. The formats are defined here. Below are some ways to get a datetime or date column to print out in the yyyy-mm-dd format.

Method 1
This is simple and straight forward. Interestingly, I don't see it defined here, but it works.

Since it is not documented it is up to you if you want to use it or not. It has been around for many years, but it is unknown if it will be there in the future. Use this option at your own risk.

In this example, the length of 20 is used to show it doesn't matter, but any size could be used since it actually gives us the format we are looking for.


That will give you 2014-07-09.

NOTE: If you would like explore other undocumented formats, check this page out.

Method 2

A clever way is to use the 126 format which has hours, minutes, seconds, etc in it and just take the first 10 characters which is in the format yyyy-mm-dd. We could get a substring, but there really isn't a need since it will be implicitly truncated to 10 characters when we use char(10) or varchar(10) as our datatype we are converting to.

SELECT CONVERT(char(10), GetDate(),126)

That will give you 2014-07-09.

Method 2
If you have slightly different format requirement such as slashes or
select Replace(convert(nvarchar(10), GETDATE(), 102), '.', '-')

This works very simply because 102 is defined in the format and I am just replacing the periods with dashes. Pretty simple and effective.

Method 3
If you want it in the format yyyymmdd (no dashes) then you can use the 112 format as shown below.

select convert(nvarchar(10), GETDATE(), 112)

That will give you 20140709.

Wednesday, July 2, 2014

Get a list of tables in SQL Server that don't have primary keys defined

In SQL Server it is a best practice for all tables to have primary keys defined. A primary key is really a constraint. Below is a query to get a list of tables and the name of the primary key contstraint that is associated with that table. If the CONSTRAINT_NAME column is null then it doesn't have a primary key defined.

order by TABLE_NAME

You can add a where clause such as

where c.CONSTRAINT_TYPE is null 

to filter the results to just the tables that don't have a primary key.

Find blank rows in a SQL Server Database

After importing data into a SQL Server database there are sometimes blank rows that get created depending on what your data source looks like. Often when using Excel as a data source extra rows will be created with all blank values. Since the table doesn't by default have a primary key all columns can be null. Find what tables have blank rows and then deleting them can be time consuming. The script here will make this much easier.

I have used this script successfully on my databases, but please, please, please make a backup of your database BEFORE you execute the following since it can affect all your tables. I am of course not responsible for any data loss caused by this script. 

Executing the script below does NOT actually do the deletes. You will still need to copy and paste the generated SQL into SSMS and execute it. I highly suggest you read the generated SQL to make sure it is doing what you want it to before you do the final execution of the generated sql.

create table #BlankRowCounts(TableName nvarchar(255), NumBlankRows int)
Declare @SQL as nvarchar(MAX)
    @SQL = ISNULL(@SQL + ' union ' , '')
    + 'select ' +
'''' + TABLE_NAME + ''' as TableName, ' +
'COUNT(1) as NumBlankRows' +
' from ' +
'[' + TABLE_NAME + ']' +
' where ' +
dbo.GetColumnList(TABLE_NAME, 1, ' is null and ') + ' is null'

insert into #BlankRowCounts(TableName, NumBlankRows)
exec sp_executesql @SQL

select TableName, NumBlankRows,
'select ' +
'''' + TableName + ''' as TableName, ' +
dbo.GetColumnList(TableName, 1, ', ') + 
' from ' +
'[' + TableName + ']' +
' where ' +
dbo.GetColumnList(TableName, 1, ' is null and ') + ' is null' as SelectStmt,
'delete from ' +
'[' + TableName + ']' +
' where ' +
dbo.GetColumnList(TableName, 1, ' is null and ') + ' is null' as DeleteStmt
order by NumBlankRows desc
drop table #BlankRowCounts

You will also need to get the code for GetColumnList() function here.

The results of this SQL are simple. There are four columns:
  • TableName - The table for which the statements will affect
  • NumBlankRows - The number of rows in the table (see TableName) that have all blank columns
  • SelectStmt - The select statement you can copy and paste into SSMS to actually see for yourself that the columns are null. You don't have to execute these, but they are here to convince yourself that the data is blank.
  • DeleteStmt - The delete statement you can copy and paste into SSMS to actually do the deleting of the rows that have all the columns as null. I highly recommend reading this BEFORE you execute it. Also, consider backing up your data if the data is important to you.

Get Comma Separated List of Columns for a Table using T-SQL

Using T-SQL (Microsoft SQL Server) you can get a list of columns (delimited by commas or other delimiter of choice) for a given table using one of the function below. There are two ways to call it. Either one words, but it is up to your personal preferences and also how safe the column names are as to which function you use.

If ever in doubt, pass a 1 for @IncludeBrackets is the safest because it puts all the column names in square brackets. This allows column names to have spaces and other special characters that would not normally be allowed. This often happens when importing data from Excel and using the default names for the columns. This is because it uses the column headings in Excel which typically have spaces in them because they are meant to be human readable.

With that said, if I created the columns I always use Just alphanumeric characters and no spaces, etc so my column names are known to be safe. In this scenario, I personally feel it is easier to read the column names without the brackets so in this case I pass 0 for the @IncludeBracket parameter.

The first parameter is simply the table name.

Basic Usage

To use the function on a table called Person do the following.

select dbo.GetColumnList('Person', 1, ', ')
sample results: [FirstName], [LastName], [Phone]
select dbo.GetColumnList('Person', 0, ', ')

sample results: FirstName, LastName, Phone

Function Definition (Code)

Here is the code to create the SQL function

create function GetColumnList(@TableName as nvarchar(255), @IncludeBrackets as bit, @Delimiter as nvarchar(500))
returns nvarchar(max)
Declare @ColumnList as nvarchar(MAX)
Declare @BeginningBracket as nvarchar(1)
Declare @EndingBracket as nvarchar(1)

if @IncludeBrackets = 1
SET @BeginningBracket = '['
SET @EndingBracket = ']'
SET @BeginningBracket = ''
SET @EndingBracket = ''

@ColumnList = ISNULL(@ColumnList + @Delimiter, '')
+ @BeginningBracket + COLUMN_NAME + @EndingBracket
where TABLE_NAME = @TableName

return @ColumnList

Advances Usage

If you want to do the same thing, but for all tables in your database you can do something like the following. You can of course add a where clause, etc to select just the sames you want as well.

dbo.GetColumnList(TABLE_NAME, 1, ', ')

Wednesday, April 16, 2014

FREE or Open Source or Inexpensive options to compare data (in tables) in SQL Server


Visual Studio 2012 (an some earlier editions) include Schema comparison for SQL Server. This is NOT what I am writing about here today. Assuming you have used this tool to make tables, etc the same you may have a need like I do when moving data from dev to production databases or the reverse. There are several ways to go about this.

FREE - SQL Server Data Tools 

Probably the best place to start is SQL Server Data Tools which is available from Microsoft. It includes among other things the ability to compare data in SQL Server tables.It is available for Visual Studio 2010 and newer. There was an option in some editions of Visual Studio 2010, but not in Visual Studio 2012. To get the functionality in Visual Studio 2012 you need the SQL Server Data Tools to be installed. Once you have it installed you will have functionality very similar to what was available in VS 2010 or the RedGate product. Here is a direct link for the download of the English ISO. One of the nice things about this option is that it is well integrated into Visual Studio 2012 and uses the same source and destination configurations as the SQL Schema Comparison that is built into VS 2012.You can also select what tables you want to compare, what columns in the tables, if you want source or destination records, etc. It will just to the update for you or you can have it generate the SQL Script that you can manually. It gives you a nice visual representation of the differences and let's you select the rows you want to change. It seems to be pretty fast. The generated SQL script even disables constraints as needed. It also seems to handle nulls properly. This is a very nice option for free!

It appears it can be called from the command line as well, but I have not tried it.

Here is the blog for the SSDT team.

FREE - tablediff.exe

IMHO, this may be the best choice for scripted options. Believe it or not tablediff.exe is a utility that comes with SQL Sever 2005 or greater. I believe this is the tool that SQL Server uses when replicating tables, though that is just what I read from someone else. It will tell you on a row by row and column by column basis what is different. It will even generate the SQL scripts needed to make the destination table look like the source table. As far as I know you cannot download it separately. However, it is installed when you install SQL Sever 2005 or newer and you choose SQL Server Replication feature. In SQL Server 2008 R2 it is included by default, but I'm not sure about the other versions. On my machine it was located at C:\Program Files\Microsoft SQL Server\100\COM\tablediff.exe. Once you find it you can type tablediff.exe -? for the options or refer here to the documentation. The parameters are pretty well documented and easy to follow to I won't go into all the options, but here is an example of how you would generate a change script (SQL) and see what the differences are.

C:\Program Files\Microsoft SQL Server\100\COM>tablediff.exe -sourceserver MySrcServer-sourcedatabase MyDevDB -sourceschema dbo -sourcetable Person -sourceuser User1 -sourcepassword User1Pwd -destinationserver MyDestServer -destinationdatabase MyProdDB -destinat
ionschema dbo -destinationtable Person -destinationuser User1 -destinationpassword User1Pwd -c -o c:\temp\diff.txt -f c:\temp\diffgen.sql

This will generate two files. diff.txt which will have the differences, and diffgen.sql which will be the SQL script you can execute to make the destination table the same as the source table.

Be careful, the SQL will also generate delete statements for your destination table. This may or may not be what you want so just be aware. I recommend backing up your destination table before doing this operation.

Also, it doesn't appear to generate correct scripts for null. It put null in single-quotes. This can be changed easily with a search and replace though.

Keep in mind this is per table. If you have lots of tables and you want them all to be updated it could be a done also, but it a bit tedious. However, the nice thing about this tool is since it is command line once you have it setup you can run it again and again with little to no effort.

If you want you can even download a GUI for the command line tool. NOTE: You still need the command line tool. I was not able to get it to work with SQL Server 2008 R2, but you can decompile it and fix it I'm sure. Does it work for anyone else? All it does is exactly what you do on the command line so I'm not sure I really see the benefit if you are going to want to run it more than once.

Inexpensive - SQLDBTools

If I have a little money to spend on the solution. SQLDBTools costs $60 at seems to do pretty much what the RedGate products do, but for a fraction of the cost. It does do schema and data comparisons which is nice to have it all in one tool. I does visually let you see the differences and it also generates the change script. I have not tried this product, but it looks like the best product for the money (if you are going to pay for a solution). Given that SQL Server Data Tools now exists I don't really see the advantage or reason to pay for this tool.

FREE - Linked Server

This is a less desirable option because it doesn't do anything automatically for you. After you have a linked server you can do queries to see the differences between tables. This does nothing for automatically generating the update scripts. This solution is good for analysis only.

Inexpensive - Beyond Compare

Beyond Compare does a nice job of visually showing the differences in two files. It doesn't know anything about SQL or databases, but as long as you use SSMS to export the data to CSV, Excel, or tabular data Beyond Compare will show you the differences in an Excel like manner. Again, this does nothing for generating the change script and could be slow for very large tables.This is best for comparing two adhoc queries in my opinion.

Additional Info

I found this site that has a pretty exhaustive list of tools for SQL Server that may be useful.