Regular Expressions Cookbook

by Jan Goyvaerts and Steven Levithan
Released 2009-05-22
Buy it from Amazon: New for $29.69

20 Reviews

4 stars At last a Use Case based RegEx Book

2009-06-07     15 of 15 found this review helpful

As much as I hate to admit it, regular expressions are hard for me. My need to use them is situation specific and I never really took the time to master them conceptually. So, when it comes time to create one, I have to grope around to figure out how to meet the need at hand.

This book is really made for a person like me. The structure is problem-solution based. And, every problem is numbered in outline format. Thus, referencing back is an easy affair.

Want to know how to find bold text in an HTML file? This book will tell you how.

Want to learn how to split a string using a regular expression? This book tells you how.
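The two tasks mentioned above can be sketched in a few lines of Python. This is only an illustration, not the book's recipe: the sample strings and variable names are made up, and the bold-text pattern is deliberately naive (it ignores tag attributes, nesting, and `<strong>`).

```python
import re

# Hypothetical sample input, not taken from the book.
html = "<p>Use <b>bold</b> and <b>strong words</b> sparingly.</p>"

# Non-greedy capture of whatever sits between <b> and </b>.
bold_texts = re.findall(r"<b>(.*?)</b>", html, flags=re.IGNORECASE | re.DOTALL)
print(bold_texts)

# Split on a comma or semicolon, absorbing any whitespace around it.
fields = re.split(r"\s*[,;]\s*", "alpha, beta;gamma ,delta")
print(fields)
```

For real HTML a parser is usually the safer tool; the regex version is fine for quick searches in text you control.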

The book discusses solutions generally and in language specifics. It supports C#, Java, Javascript, Ruby, Python, PHP, Perl, VB.NET.... the entire cast of the usual characters. (No pun intended.)

The writing is clear. You can take things in a bit at a time. And although some of the problems use those 'hard to get' concepts, the topical discussions actually teach you the difficult concepts in a manner that is pretty easy to understand. Sometimes you might have to go over a section a few times to get full understanding. But the review is not a chore.

This is a good, useful book. It's helping me to become a better engineer. And believe me, I need all the help that I can get! :)

5 stars Goes further and deeper than many tutorials on regular expressions

2009-06-21     11 of 11 found this review helpful

This excellent book goes further and deeper than many tutorials on regular expressions. You might be surprised with some of the things you'll learn from reading it.

Unlike many cookbooks, this one doesn't dive into the recipes right away. I thought this was a good call because regular expressions are a specialized topic, and most developers don't work with regular expressions on a daily basis so they probably have to be reminded of the building block concepts and syntax, and get prepared for a discussion of more advanced features. Chapter One provides a list of recommended tools for working with regular expressions. Chapter 2 is a concise but very thorough discussion of building block and more advanced regular expression concepts (e.g., possessive quantifier or atomic grouping, named capturing groups, lookahead and lookbehind, etc.), including a discussion of differences in engine implementations and feature support. Chapter 3 is a hundred-plus page tutorial on how to work with regular expressions using different programming and scripting languages, including potential gotchas and workarounds. Chapters Four through Eight contain the recipes for solving real-world problems, with tips on how to improve an initial solution's readability (e.g., use named capturing groups when possible, etc.) and/or efficiency.
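To make two of the concepts named above concrete, here is a minimal Python sketch of named capturing groups and lookahead. The patterns and sample strings are assumptions chosen for illustration, not examples from the book. Python spells a named group `(?P<name>...)`; other flavors use slightly different syntax.

```python
import re

# Numbered groups: the reader has to count parentheses to know what group 2 is.
numbered = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Named groups: the same pattern, but each part documents itself.
named = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

match = named.search("Released 2009-05-22")
date_parts = match.groupdict()
print(date_parts["year"], date_parts["month"], date_parts["day"])

# Positive lookahead: match digits only when " stars" follows,
# without making " stars" part of the match.
ratings = re.findall(r"\d+(?= stars)", "4 stars, then 5 stars, from 20 reviews")
print(ratings)
```

The named version matches exactly the same text as the numbered one; the gain is purely in readability when the pattern is revisited later.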

I was initially skeptical about the authors' ambitious goal of covering so many regular expression flavors, thinking the discussions of differences in engine supported features might prove distracting. The book is written and organized so well, however, that my fear did not materialize. In fact, I was pleasantly surprised to learn that, of the covered flavors, Microsoft's DotNet regex engine supports some of the most advanced features.

There's not much to dislike about this book but if I were asked to suggest one or two things that might be of value-add to readers, I would suggest making available for download files containing appropriate subject strings for testing the book's various recipes as a convenience to readers who learn best by doing and want to follow along as they read the recipes, and for the book to include, for easy reference, a feature-support comparison matrix of the covered flavors, much like the comparison table available in the regular-expressions.info website.

5 stars VERY VERY HIGHLY RECOMMENDED!!

2009-06-11     11 of 11 found this review helpful

Do you regularly work with text on a computer? If you do, then this book is for you! Authors Jan Goyvaerts and Steven Levithan have done an outstanding job of writing a book that shows you how you can use regular expressions in situations where people with limited regular expressions experience would normally say it can't be done.

Goyvaerts and Levithan begin by explaining the role of regular expressions and introduce a number of tools that will make it easier to learn, create, and debug them. Next, the authors cover each element and feature of regular expressions, along with important guidelines for effective use. Then, they specify coding techniques and include code listings for using regular expressions in each of the programming languages covered by this book. They continue by focusing on recipes for handling typical user input, such as dates, phone numbers, and postal codes in various countries. Next, the authors explore common text processing tasks, such as checking for lines that contain or fail to contain certain words. Then, they show you how to detect integers, floating-point numbers, and several other formats for this kind of input. The authors continue by showing you how to take apart and manipulate the strings commonly used on the Internet and Windows systems to find things. Finally, the authors cover the manipulation of HTML, XML, comma-separated values (CSV), and INI-style configuration files.

This most excellent book shows you everything you need to know about regular expressions, and then some, regardless of whether you are a programmer. More importantly, if you read this book cover to cover, you'll become a world class chef of regular expressions.

5 stars We Have Been Waiting for This One

2009-06-09     11 of 12 found this review helpful

I was getting set to write a review of this book, when I happened to visit one of the blogs I regularly read -- Coding Horror. Jeff Atwood says it all for me so please take a look at what he has to say.

If you are a serious programmer or even if you are a Web GUI design person forced to do a bit of JavaScripting like me, you are going to run into situations where using a regex engine is the appropriate tool. Regular expressions are not easy to learn and are kind of boring. They are also very powerful.

Most of us learn faster by doing -- and that most often means working from code we or someone else has done before that does something a bit like what we want to do but needs some tweaking or extending or generalizing. If you are like me, you already have a collection of regular expressions to help in this process. This book does better than that by collecting hundreds of examples together in ways that build your understanding while never getting abstract or divorced from the real problems we face.

Your shelf has a place for this book. Recommended.

5 stars Superb collection of Regular Expressions recipes is a great teacher

2009-07-07     5 of 6 found this review helpful

Jan Goyvaerts is well-known in the RegEx community; Steven Levithan somewhat less so. But the degree of fame is unimportant, both are Regular Expressions gurus and really know what they are talking about. Goyvaerts also writes and publishes some really cool software tools, including two for dealing with Regular Expressions.

The last great book on Regular Expressions was "Mastering Regular Expressions" by Jeffrey E. F. Friedl, also published by O'Reilly. This book does not replace "Mastering Regular Expressions", but complements it. Between the two volumes, you'll know everything of importance worth knowing about Regular Expressions and their use.

Regular Expressions are used to find specific patterns of text. For anyone working extensively with text of any kind, Regular Expressions are as necessary as water and air to sustaining human life. Most people never get beyond the primitive search functions of their word processor or spreadsheet program. Too bad: they're missing a lot.

The ugly part of what they're missing is learning how to use Regular Expressions.

Conceptually, Regular Expressions are difficult for many people (like me) to grasp and even more difficult to learn. A big part of that is the staggering power of Regular Expressions ("regexp" or "regexes"). Want to do a single search for specific words that are misspelled? Regex. How about sentences beginning or ending with specific words? Use a regex.

In their cookbook, the authors demonstrate more than a hundred examples. Better yet, they do it in seven common regex flavors. The authors claim "Regular Expressions Cookbook" is all you need to learn how to use Regular Expressions. They do start with the basics, but I question whether this book is all most will need. I think consulting one of the many fine Regular Expression tutorials on the web might be a helpful first step for the utter novice.

The cookbook itself is absolutely marvelous.

There are more than one hundred recipes, beginning with matching literal text; advancing through matching previously matched text again; retrieving a list of all matches; validating formats of things like email addresses, international phone numbers, even European VAT numbers; finding words not preceded or followed by a specific word; and much more.
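One of the tasks listed above, finding a word that is not preceded or followed by a specific word, is typically done with negative lookaround. The sketch below is an assumption about how that might look in Python, with a made-up sample sentence; it is not the book's own solution. Note that Python's `re` module requires lookbehind patterns to have a fixed width.

```python
import re

# Hypothetical sample text.
text = "the black cat sat near a cat and another black cat"

# Negative lookbehind: "cat" as a whole word, unless "black " comes right before it.
not_after_black = re.compile(r"(?<!\bblack )\bcat\b")
after_positions = [m.start() for m in not_after_black.finditer(text)]

# Negative lookahead: "cat" as a whole word, unless " sat" comes right after it.
not_before_sat = re.compile(r"\bcat\b(?! sat)")
before_positions = [m.start() for m in not_before_sat.finditer(text)]

print(after_positions, before_positions)
```

Both lookarounds are zero-width: they test the surrounding text without adding it to the match, so each hit is still just the three letters of "cat".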

This is, in short, a book for the true geek to curl up with and read. You may not need the information now, but you will need it someday and just browsing is an effective way to pick it up. Likewise, if you're looking for an immediate solution to a problem right now, just check the Table Of Contents. Odds are you'll find what you're looking for or something real close. Sadly, however, the index isn't very good.

In short, this is the newest benchmark reference for Regular Expressions. With this and "Mastering Regular Expressions", you are going to be a Master of the Universe and do things with text that will leave ordinary mortals awestruck.

Jerry

5 stars Outstanding and unique coverage of the regular expression

2009-07-05     4 of 5 found this review helpful

This book is dedicated to the solving of problems where the use of the regular expression is appropriate, which is just about any situation where a programmer needs to process text in an efficient and concise manner. This book not only solves problems in general, it discusses the most popular regular expression types and compares them throughout the book as various problems are tackled. There are seven regular expression types covered in this book, and chapter three contains code listings for regular expression use in CSharp, Java, Javascript, PHP, Perl, Python, Ruby, and VB.

My perspective when coming to this book was that of someone who first used regular expressions in Perl, but wanted to really be an expert on this useful subject and also to transport what I've learned in Perl about regular expressions into the Java language, where I had never used them before. This little book did the trick, and now I feel much more comfortable with this subject. The first three chapters of this book are more timesavers and pointers than specific problem solvers. The meat of the book is from chapter four onward. The following is a listing of each chapter and its contents:

1. Introduction to Regular Expressions - Defines what regular expressions are and introduces you to a number of utilities.

2. Basic Regular Expression Skills - The problems presented in this chapter aren't real world problems. Instead they are technical problems you'll run into while creating and editing regular expressions while solving real world problems.

3. Programming with Regular Expressions - This chapter explains how to implement regular expressions with your programming language. The recipes here assume you already have a regular expression and now you need to insert it into one particular language.

4. Validation and Formatting - Contains recipes for validating and formatting common types of user input. Some of the solutions show how to allow variations of valid input, such as postal codes that can contain either five or nine numbers.

5. Words, Lines, and Special Characters - Contains recipes that find and manipulate text. Some of the recipes show how to do things you would find in a search engine, such as finding any one of several words or finding words that appear next to one another. Other examples help you find entire lines that contain particular words, remove repeated words, or escape regular expression metacharacters.

6. Numbers - Regular expressions don't understand that numbers have a mathematical meaning. However, sometimes you still need to process them inside the regular expression rather than passing them to a programming language that can deal with their numerical interpretation. This chapter deals with this issue.

7. URLs, Paths, and Internet Addresses - The URL format has proven so flexible that it has been adopted for a wide range of uses. This chapter helps you process and parse URLs in a variety of settings.

8. Markup and Data Exchange - Deals with common tasks that arise when working with common markup languages and formats such as HTML, XHTML, XML, CSV, and INI. A brief description of each technology is included. The chapter concentrates on the basic syntax rules needed to correctly search the data structures of each of these formats.
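Two of the chapter descriptions above lend themselves to a short sketch: a postal code that may have five or nine digits (chapter 4) and removing repeated words (chapter 5). The Python below is an illustration under stated assumptions, not the book's recipes: it assumes the nine-digit form is written as US ZIP+4 with a hyphen, and the function names `is_zip` and `remove_repeats` are hypothetical.

```python
import re

# Five digits, optionally followed by a hyphen and four more digits.
# Assumption: the nine-digit variant uses the ZIP+4 "12345-6789" layout.
zip_code = re.compile(r"\d{5}(?:-\d{4})?")

def is_zip(value: str) -> bool:
    # fullmatch requires the whole string to fit the pattern.
    return zip_code.fullmatch(value) is not None

# A word followed one or more times by itself; \1 is a backreference
# to whatever the first group captured.
doubled = re.compile(r"\b(\w+)(?:\s+\1\b)+", flags=re.IGNORECASE)

def remove_repeats(text: str) -> str:
    # Keep only the first occurrence of each repeated run.
    return doubled.sub(r"\1", text)

print(is_zip("12345"), is_zip("12345-6789"), is_zip("1234"))
print(remove_repeats("this this is the the book"))
```

Using `fullmatch` instead of anchoring with `^` and `$` keeps the pattern reusable for searching inside longer text as well.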

I highly recommend this book for anyone who needs extensive coverage on this very useful subject. As a Java and Perl programmer, I can say that other books do discuss the regular expression in the context of these languages, but none will cover so many situations as this one does.

5 stars Jan is a regex guru!

2009-06-26     4 of 4 found this review helpful

Preface: When I first dove into regular expressions two years ago, I jumped in head first with Jeffrey Friedl's classic: "Mastering Regular Expressions - 3rd Edition" (MRE3). I've read it twice so far and it is truly a masterpiece (very highly recommended). As I was learning to "think in regex", I needed some reliable tools with which to practice my newfound regex skills (I'm primarily a Windows guy). Although my text editor of choice at the time had three flavors of built-in regex support (UltraEdit32), its support for Perl compatible syntax had a few inconsistencies. A search for better regex tools led me to the EditPad Pro text editor and RegexBuddy, both from "Just Great Software" (JGSoft). These tools proved to be so well designed, bug-free and useful, that I decided to purchase the much more expensive PowerGrep (and was blown away with what I can do with that!) Armed with these tools and my newfound mastery of regular expressions (courtesy of MRE3), I now feel that I could conquer nearly any challenge from the world of text processing. I was very impressed with the quality, accuracy and attention to detail of all of JGSoft's software products, and when the author of these tools announced that he had a new book coming out on regular expressions, I pre-ordered it sight unseen. I knew it would be good. And it is. (Note that I am not affiliated in any way with JGSoft, I am just a very happy user of their software.)

Review: Jan Goyvaerts is one of the world's experts in the field of regular expressions. He is an "attention to details" and "we will serve no wine before its time" kind of guy, so I was not surprised to find "Regular Expressions Cookbook" well organized, accurate, easy to read, and having very few typos and/or grammatical errors. As far as I could tell (and I'm very nit picky), the recipe regexes presented are both accurate and efficient. Each recipe begins with a statement of the problem and the regex solutions, followed by an in-depth discussion/explanation of what is going on inside each regex component. And for each "recipe", multiple regexes are typically provided covering a spectrum of specificity (starting with general "easy-matches" moving to the more specific). And regexes are provided for each of the many supported flavors (i.e. .NET, Java, Javascript, PCRE, Perl, Python and Ruby), complete with the required regex modifiers/options. And where appropriate, footnotes are provided that further describe the peculiarities regarding a specific regex flavor. Attempting to cover all modern regex flavors in one book was a tall order, but this book does it well. In addition to the many regex flavors, Chapter 3 covers regex handling with many different programming languages (C#, VB.NET, Java, Javascript, PHP, Perl, Python and Ruby). Although presented in "Cookbook" fashion, the order of the recipes is such that if you start at the beginning (recipes start in Chapter 2), the basics of regular expression syntax are presented in a logical, progressive manner, and thus, this book also functions as an effective tutorial.

However, I'm not personally a big fan of "Cookbook" style books in general, and prefer to learn a subject in depth in a systematic manner, and figure out the recipes later for myself. And in this regard, this "Cookbook" is not the best regex tutorial on the block (go to MRE3 for that). But if you don't have the time or inclination to learn regex in depth, and just need a solution to a specific problem right now, this book will serve up an accurate, efficient regex solution, no matter which tool or regex flavor you are using. And in one regard, the "Cookbook" beats MRE3 because it covers the Javascript flavor (which is not covered at all in MRE3). One other minor "Cookbook" deficiency is that it does not cover the new recursive expressions found in the latest versions of PHP/PCRE.

Note: If you are a Windows regex user, the RegexBuddy program is highly recommended - it has built-in libraries containing nearly all of the regexes provided in this "Cookbook" as well as built-in regex related code snippets for all of the programming languages. It's an excellent piece of software and an indispensable aid to the process of learning regular expressions. (It even has a built-in private forum where you can ask questions of the author directly!) And be sure to check out Jan's resource: [...] for free online regex tutorials and reference. Bottom line: Jan knows regular expressions and is very adept at explaining them!

5 stars Both for the absolute beginners and for the experienced user

2009-06-25     4 of 4 found this review helpful

I've been programming for years, but somehow I discovered building regular expressions way too late. I love the baby steps in the book: in the real-world examples you start with a simple working example, which is improved upon again and again in areas of performance and avoiding false matches.
Having used the tools from Jan Goyvaerts for years, this book was a must-have for me. I have already learned so much from his online work, and this book even expands on it. Having to program in several languages, knowing about all the differences in the different regex flavors adds to the usability of the book.

5 stars Very useful and clear

2010-01-29     3 of 3 found this review helpful

I have a copy of Mastering Regular Expressions by Jeffrey E. F. Friedl on my bookshelf. I bought it a long time ago to try to improve my skills at using regular expressions to search text and check input against desired norms. While that book is clear and well written, I am sometimes a bit impatient and it was taking too long for me to figure out how to do the things I wanted to do and I got distracted or busy before I read enough to complete the task (I ended up using Google and finding what I needed quickly). I have to admit that I still don't have the regular expression skills I want to have, although this book promises to teach them to me. Someday it may do so.

What I was looking for was a book that would teach regular expressions while giving concrete examples of real life use cases that I could immediately put to work. This book is filled with them.

Chapters one and two lay the foundation by covering the basics of what regular expressions are, using them to search and replace, match text, and other basic skills. This is good, but where the book really sets itself apart is in chapters three through eight, which are overflowing with useful recipes for things like validating ISBNs, finding URLs within text, stripping leading zeros or matching IP addresses (IPv4 and IPv6). The book has an obvious organization scheme, a ton of useful recipes, and a useful index. Finding what you want or need is very easy to do, and unless your needs are especially unique or esoteric, you will probably discover exactly what you require in the book.

The best part of the book is that every example uses a clear format that sets the stage for an easy discovery of needed information.

First, a problem is stated, such as in chapter four's item, 4.1 Validate Email Addresses, which says, "You have a form on your website or a dialog box in your application that asks the user for an email address. You want to use a regular expression to validate this email address before trying to send email to it. This reduces the number of emails returned to you as undeliverable."

Next, a solution is defined, with code examples, accompanied by a description of the particular details that are vital to comprehend when implementing the solution. Next, each recipe has a section for further discussion that leads to a deeper understanding of the regular expression being used and the context in which it is being used.
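To give a feel for what such a recipe's solution might look like, here is a minimal Python sketch of email validation. The pattern is a deliberately simple, hypothetical one written for this illustration; it is not the regex from recipe 4.1, and the helper name `looks_like_email` is an assumption.

```python
import re

# Simplified pattern: a local part, "@", one or more dot-terminated domain
# labels, and a top-level domain of at least two letters.
# It accepts some invalid addresses and rejects some unusual valid ones.
email_pattern = re.compile(
    r"[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,}",
    flags=re.IGNORECASE,
)

def looks_like_email(address: str) -> bool:
    # fullmatch: the whole input must be an address, with nothing extra around it.
    return email_pattern.fullmatch(address) is not None

for candidate in ["user@example.com", "a.b+c@mail.example.org", "user@example"]:
    print(candidate, looks_like_email(candidate))
```

A check like this only catches obvious typing mistakes before sending; it cannot confirm that the mailbox actually exists.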

Especially wonderful is that every recipe has very specific and clear code examples for use with Perl, PCRE (the "Perl Compatible Regular Expressions" library for C, which isn't identical to Perl's use of regular expressions, even though it tries), .NET, Java, JavaScript, Python, PHP, and Ruby with notes on which specific release versions or variations of each are covered. When differences exist in the implementation in these environments, those differences are clearly noted and discussed. This feature will make life much easier for people who need to use regular expressions in more than one language context and is a feature of the book I appreciate greatly.

The other regex book on my shelf will remain there until that mystical moment "when I have time to study it." This book will be used regularly as a reference.

5 stars Taming of the Shrew!

2009-06-25     3 of 4 found this review helpful

I'm really enjoying my copy of Regular Expressions Cookbook... got it last week and I'm finding interesting and different perspectives on problems I solve everyday in Perl... I've been using Perl and regexes for almost a dozen years now, and I have many of the same regex solutions as the cookbook, but every once in a while I'm seeing another way to do something! TIMTOWTDI! as they say in Perl. Thanks for a good contribution to the regex literature... the Cookbook goes well on my nightstand right next to Jeff Friedl's Mastering Regular Expressions (all three editions!) BTW, I picked up another copy of the Cookbook to leave at the office for my team's reference.

5 stars Excellent book

2009-08-24     2 of 2 found this review helpful

This book is very well written and the author definitely knows the subject matter.

The book has a great format: after a brief tutorial, it is organized by problem/solution.
Each problem/solution is further organized by dialect (javascript, .net, perl, etc.) so
you can quickly zero in on what you are looking for.

Each problem/solution has the source code and explanation so you can adapt the code if needed.

Regular Expressions are extremely valuable, but they are also prone to misuse resulting in colossal headaches.
This book will help you avoid those headaches.

5 stars This Book is a Godsend

2009-07-15     2 of 3 found this review helpful

I have been using Jan Goyvaerts' RegexBuddy software for a couple of years. You know the drill - a little regex here and there just to make the code interesting. Now I'm in the middle of a project that was made for Regex: thousands of "smart" part numbers and descriptions that need to be decoded according to a couple of hundred patterns with thousands of permutations. This book has helped me so much. It's so lucid and well organized, you'll be up and running in no time.

5 stars A Great Resource

2009-06-27     2 of 3 found this review helpful

This is another excellent item in the O'Reilly Cookbook series. It covers a lot of ground and the writing is clear and concise. If you want to get good at RegEx, this is the book to get.

Even though there's an introductory chapter on RegEx, I think you need to be somewhat familiar with RegEx to make good use of this book. You don't need to be an expert to use this book, but I suggest spending time with chapters 1 and 2, then playing with RegEx on your own before proceeding. The introduction gives you a list of RegEx tools to use. I recommend RegEx Buddy, which I've been using for 4+ years. I was writing my own RegEx application until I found it and decided to save my time for problems that hadn't been solved. RegEx Buddy helps you build RegExs and is also a great testing tool, where you can use real data to verify your design.

I haven't had time to try everything out in the book yet, but so far I haven't found mistakes. The book covers the .NET, Java, JavaScript, PCRE, Perl, Python, and Ruby flavors of RegEx. It gives examples in C#, VB.NET, Java, JavaScript, PHP, Perl, Python, and Ruby. For all intents and purposes VB6 uses the .NET flavor, though the code implementation is not as clean. There are many good online articles that cover this topic (e.g. http://support.microsoft.com/kb/818802).

The recipes are all practical, some fairly simple, some quite complex. What makes them all useful is the way they're organized. There's a clear statement of the Problem, followed by an explanation of the Solution with the regular expression (in all flavors) and a specific language example. Section 3.5 gives all the language examples for its problem.

The Discussion sections are the best part of the book. Each recipe gets a thorough explanation of what goes into the RegEx creation (or the code implementation). You'll learn how to think more effectively in RegEx from these discussions. It's not just the recipes you're getting, but the principles behind designing with RegEx.

The only weak point of the book is that there's not a separate index for the recipes. However, the table of contents is sufficient for that.

5 stars Thank you and bravo to Steven Levithan and Jan Goyvaerts

2009-09-21     1 of 1 found this review helpful

I am not a great regular expressions expert, rather a beginner in some ways, but this book is definitely well organized and is one of the best I have ever seen for finding solutions using regular expressions. With a good, well-organized library of formulas in several programming and scripting languages, as well as some pretty advanced topics, this book has something that could benefit everyone using it.
I don't have enough stars to accurately give toward this book.
Thank you and bravo to Steven Levithan and Jan Goyvaerts

5 stars Great book!

2009-08-14     1 of 2 found this review helpful

I love the O'Reilly Cookbook series. The format is very usable, and very helpful. This book continues the trend.

This book fully met my expectations, giving me easy access to syntax for common Regular expression problems I need to solve. The recipes are all practical, but solve a range of simple to complex problems.

I will keep this book close by and use it often, it is a great reference tool!

5 stars A Valuable Introduction & Cookbook for String & File Handling Using Regular Expressions

2009-06-22     1 of 1 found this review helpful

This O'Reilly Cookbook is everything it promises and more. It provides a comprehensive and easy to follow cookbook for one of the most valuable tools in any programmer's arsenal. Regular expressions have been for many years and remain the single most powerful multi-language tool for finding, indexing, correcting and modifying text found in computer files. While this valuable tool set is available in a wide range of modern programming and scripting languages and is perhaps the single most powerful instrument for searching and modifying text files, it has, until now, been a tool with a steep learning curve for the non-mathematically inclined programmer. By providing a useful, yet carefully graduated set of examples, and illustrating the use of instructional software, including but not limited to one of the authors' RegexBuddy, this cookbook provides the tools a programmer needs to understand and fully exploit these powerful methods for text search and modification. The well-informed programmer, especially one schooled by this cookbook, can use these tools to rapidly find complicated lexical strings, search and cross-index a text, recognize and syntax check a source code file, or process specialized languages in XML or TEX. One small nit I have with this invaluable book is that it could provide some historical background on these invaluable programming tools, and trace their origin to the SNOBOL string processing language, UNIX text processing tools like AWK and SED, and Stephen Kleene's metamathematical notation.

This is an extremely necessary book for any programmer's library. I would also recommend that the author and O'Reilly offer a package deal for Jan's Invaluable Regular Expression learning and programming tools RegexBuddy and PowerGREP.

--Ira Laefsky

3 stars Regular Expressions Can be complicated

2010-01-03     0 of 0 found this review helpful

This book is not for a Regular Expressions novice, which I am. I was hoping for a book that would first explain the concepts and methodologies before diving into examples. But based on the title of the book I should have known better. The cover states that there is a tutorial but I couldn't find it.

Now that I have the bad part out of the way, the good part is that it gives you a tremendous amount of examples. So if you are looking for a book that gives you the answer on a specific expression this is the book for you.

5 stars Learn how to format input, manage lines, find solutions for common markups and paths, and more

2009-10-13     0 of 0 found this review helpful

Jan Goyvaerts' and Steven Levithan's REGULAR EXPRESSIONS COOKBOOK provides over 100 recipes to help blend data and text with regular expressions, offering programmers and software development collections a powerful set of step-by-step instructions for using C#, Java, Perl, Python and VB.Net, among others. Learn how to format input, manage lines, find solutions for common markups and paths, and more in this advanced programmer's cookbook of basics.

5 stars For Only $30 It'll Pay For Itself In One Use

2009-10-11     0 of 0 found this review helpful

This is the kind of book one doesn't actually read (unless you're a masochist ;-), but is essential to have around. Let's face it, very few of us are good with regular expressions. If you aren't using them every day (which most of us aren't) they are just some kind of black-magic stuff we need to do every now and then.

From what I've seen the examples and explanations are clearly written, and the fact that they show - and explain - solutions for Perl, .NET, Java, JavaScript, Python and Ruby makes this book too good to pass up.

2 stars Confusing organization; too specific or too general

2009-09-14     0 of 10 found this review helpful

I'm doing more and more regular expressions so thought this was a book I had to have, especially considering it's an O'Reilly. However, I've found its organization confusing. The recipes are either too general or too specific to be of any use. My approach to problem solving is that for every problem there is already a solution. You just need to find it, plug in your data, and turn the crank. Unfortunately, I haven't been able to use this book.