Monday, April 7th, 2008

Using the new and sadly broken Excel Worksheet 'CustomProperties' collection

I think the comment explains it all.
Cut for VBA geekines and trauma for VBA-Phobics... )
I can has aspirin nau?
(5 comments | Leave a comment)

Friday, July 6th, 2007

Tuning in, turning on, and (so far) not dropping out

This afternoon I am working away on some SQL, automating the process by pasting the existing CREATE TABLE scripts in Microsoft Excel and using some elementary string-handling to split the clauses and automate a job that is essentially a series of find-and-replace operations.

Oh go on... Show us some VBA... )

Being geeky, and somewhat bored, I've embedded a Media Player in a spreadsheet and linked to the BBC Wimbledon stream. The Ladies' semi-finals, I think, and a very pleasing distraction it is, too.

This evening I shall trundle to Darkest Somerset, for tomorrow's Dan Grade Group Course at the Ki Federation's dojo. Hopefully I'll be back at a civilised hour on Saturday...

(1 comment | Leave a comment)

Wednesday, June 6th, 2007

More on string-handling and VBA: matching words by their longest common string.


More thoughts on string-matching and measuring the similarity of words, phrases and addresses: this time, I'm looking at the Longest Common String approach, and a modified LCS algorithm that gives consistent results on addresses and sentences without the need for a separate scoring process that examines the word order.

It gets geeky from here on in... )


It's up to you to paste this into the fuzzy-matching VLookup functions I posted a few days ago. Let me know how you get on!


A parting shot for the uber-geeks... )
(2 comments | Leave a comment)

Wednesday, May 23rd, 2007

String-Matching in Excel: VLookup() with fuzzy-matching to get a 'closest match' result

Ever had to look up a name or an address in a list that doesn't quite match, so the standard Excel 'VLookup()' and 'Match()' functions don't help you?

Thanks to some suggestions made by friends on a previous post, I've been able to code a lookup function that returns the closest match in a list:


Screenshot: Excel Dialogue for a Fuzzy-Matching lookup function

Here's the details:

Public Function VFuzzyLookup(Lookup_Value As String, Table_Array As Variant)


Find the best match for a given string in column 1 of an array of data obtained from an Excel range
This is functionally similar to VLookup, but it returns the best match, not the first exact match
This function is not case-sensitive.


  • If your data quality is poor, you are advised to display the retrieved index value from column 1 and use the MatchWord function on this index value to reveal the fuzzy-matching 'score' and discard all results below a threshold score.
  • Use VFuzzyLookup_Phrase if you are trying to match a phrase or sentence (a sequence of words separated by spaces) as that function is faster, and has additional logic for the word order.
  • Use vFuzzyLookup_Address if you are looking up names and addresses: that function has additional logic to normalise the common abbreviations and word-order conventions used in British addresses.


Yes, you read that correctly: you have a best-match lookup, vFuzzyLookup_Address() for lists of names and addresses. It could do with some performance-tuning (I wouldn't try it on lists exceeding 1024 members if you're in any kind of hurry) but it's usable.

Here's the source code... )
There's a particular point about the Phrase-matching function that could probably be improved:

' For each word in Phrase 1, identify the closest matching in Phrase 2 and record its position.

This discussion is aimed at other programmers: feel fee to comment, but it is rather technical in nature... )


Free to use any or all of it: there are limitations to what it can do, but I think this is good enough for everyday use and it plugs an obvious gap in the standard Excel worksheet functions.

(14 comments | Leave a comment)

Thursday, May 17th, 2007

Arrays of checkboxes in an Excel spreadsheet

Ever had a range of cells holding Boolean ('True' of 'False') values that you wanted to edit quickly? It would look a bit like this... )
(4 comments | Leave a comment)

Thursday, May 10th, 2007

Programming question: how 'matched' is a matching string?


An interesting question - easy to answer if two strings "ABCD" and "ABCD" are identical, harder if you have two sets of names and addresses that almost match and need some kind of fuzzy-matching process with a threshold.

What algorithms and metrics are out there?

Which of them are practical in a VBA or VB.NET environment, where simplicity is essential and performance is an issue?

My efforts... )

How did you deal with this type of problem when it came up?
(8 comments | Leave a comment)

Monday, September 4th, 2006

Optimising the 'Find' method in a sorted ADO recordset


Another day, another dollar... Except that they only accrue interest on weekdays, and I can't pay in the cheque on a bank holiday. 

The point is that we need a quick way of knowing if a given day really is a business day, and to shift it forward (or back) automatically if it isn't. More importantly, we need to do so without mounting our own denial-of-service attack on the calendar database - because this is the kind of thing that appears in a spreadsheet with tens of thousands of payment dates, all of which will need to be checked. Most companies have to maintain a database of public holidays: here's an example for stock exchange trading and settlement holidays from an international stockbroker in Singapore*, and if you're going to use the sample code in this post, I would suggest that you grab it and save it into a database table.

This is what a 'Business Day' function should look like in Excel... )

A bit of backgound on the hows and whys of payment schedules and 'business day' calendars... )

This is where it starts getting difficult: I'm a VBA developer, and I don't have the sophisticated data structures available to a C++ developer. The data structures I can use are:

  • Arrays, which are fast (but require a massive coding overhead);
  • Collections, which are easy (but very, very slow);
  • And ADO recordsets, which come with built-in searching and filtering (but can be very slow indeed if you don't have access to the server-side 'seek' method).

This example retrieves a large recordset of all the calendars, filters it down to the currency or calendar group we want, and applies a neat optimisation to the ADO 'Find' method.

How the optimisation works: )

The code is hidden behing a cut tag because there's rather a lot of it... )

*Tullet's have published it freely - which makes them jolly nice chaps, because this information is tedious to collect and very rarely published.

(4 comments | Leave a comment)

Tuesday, August 15th, 2006

The End Times: another Excel annoyance:

Ever see a column of month-end payment days do this in a spreadsheet?

 31/08/2005
 30/11/2005
 28/02/2006
 28/05/2006
 28/08/2006
 28/11/2006
 28/02/2007
 28/05/2007
 28/08/2007
 28/11/2007
 28/02/2008

Sigh. All VBA developers eventually face the weary task of correcting the VBA.DateTime function library because of this loathsome miscoding by MicroSoft:

' Special handling required for adding months at EOM:
' VBA.DateAdd("m", 1, "28 Feb 2006") = 28/03/2006 (!)
' Business logic is ALWAYS that adding a month to EOM
' gives the end of the following month - 31 Mar 2006.


Here's my solution: I guess you've all got one of your own by now. The code is hidden behing a cut tag... )
(5 comments | Leave a comment)

Tuesday, January 24th, 2006

Recovering VBA components from corrupted Excel workbooks


This is a fairly reliable method for recovering VBA from corrupted workbooks, and it runs in all versions of Excel from 97 onwards. Of interest to Excel geeks ONLY! You have been warned. )

(Leave a comment)

Thursday, January 19th, 2006

Excel and VBA: detecting an array variant

This will return 0 for scalar variables and a positive integer for arrays:

    InStr(TypeName(varTest), "()")

This casts correctly to Boolean TRUE or FALSE and can be used directly in an IF... THEN clause.

Kludgy, but effective. Note that this detects empty arrays: their typename isn't "empty", it is Empty()

The correct approach is to use the native VBA function VarType(), which reads the numeric type constant. An integer is VarType 2 (vbInteger) and the vbArray constant (8192) is used in a bitwise operation to give 8192 for an array of integers: [vbInteger OR vbArray]

The 'correct' approach fails when you try to wrap this logic in a function, for reasons stated in the comments...


Public Function VariantIsArray(varTest As Variant) As Boolean

' Return TRUE if varTest is array
' FALSE for User-defined types

' Note that all ranges of two or more cells are cast to an array variant
' even when you omit the '.Value' property. All object types that expose
' a default property will pass it into our function's parameter, instead
' of a reference to the object itself. There can be no reliable and self
' -contained IsArray() function in VBA, because you have to run a prior
' test using IsObject() in the calling function.

' What, you read the documentation and believed that IsArray() works? Try
' it on an array of objects, a single-member array, and an empty variant.

VariantIsArray = VarType(varTest) And vbArray

End Function
(7 comments | Leave a comment)

Tuesday, December 20th, 2005

Rebuilding bloated and unstable Excel workbooks.


Yet another specialist post, intended for my own reference and for VBA developers. The main interest is the example code for handling merged cells in VBA; they are far more trouble than they are worth.

Actually, it might interest other programmers, too, as a kind of deterrent to ever working with Excel. Be afraid, be very, very afraid... So it's all behind a cut tag )

(Leave a comment)

Tuesday, December 13th, 2005

A 'Watched Ranges' VBA wrapper class for Excel.


More Excel stuff: this source code is probably of limited interest, even to other VBA developers.

This is particularly true of the real programmers in the HairyEars LJ Friends List, all of whom could code this up in a matter of minutes with a few dozen lines of C++. The post is very much for my own reference, and for the occasional visiting VBA developer. So it's all behind a cut tag )

(Leave a comment)

Thursday, December 8th, 2005

CRC32 and Adler-32 Cyclic Redundancy Codes for an Excel Range

With thanks to [info]naranek for pointing me in the right direction... And to [info]ciphergoth for recommending the Adler-32 algorithm.

This post is more for personal reference than actual bloggery, so the source code has been hidden under a 'cut' tag for extreme geekiness... )
(Leave a comment)

Wednesday, October 12th, 2005

Excel VBA Snippet: the Time zone adjuster


This is a code module that gains access to the Windows registry API and reads the 'locale' settings that encode Time Zone information. Obtaining information about timezones is made quite difficult in Windows; you can always tell the time here and the daylight-saving work is all done for you - why on earth would anyone want want to know exactly when this happens, or what the time is somewhere else? Look behind the cut... )

(Leave a comment)