
Reading unknown length line

Ray Phan
Feb 02, 2015
<p>One suggestion is to use <a href="http://en.wikipedia.org/wiki/Regular_expression" rel="nofollow">regular expressions</a> to search your string for substrings that consist of one ID, followed by <code>-&gt;</code>, followed by another ID. Once we find these patterns, we extract them and place them into a cell array. Supposing the string is stored in <code>s</code> (I'm going to use your example), do this:</p>
<pre><code>s = 'bobaboao dsaas : 5-&gt;2 2-&gt;3 4-&gt;6 7-&gt;2 1-&gt;4 5-&gt;1 8-&gt;1 222-&gt;1 23-&gt;13';
g = regexp(s, '[0-9]+-&gt;[0-9]+', 'match');
</code></pre>
<p>Let's go through this code slowly. <code>s</code> stores the string you're analyzing. The next line searches <code>s</code> for substrings made up of at least one digit, followed by <code>-&gt;</code>, followed by at least one digit. The <code>'match'</code> flag returns the substrings that match this pattern. The output <code>g</code> is a cell array with one matched string per cell. We thus get:</p>
<pre><code>g = 

  Columns 1 through 7

    '5-&gt;2'    '2-&gt;3'    '4-&gt;6'    '7-&gt;2'    '1-&gt;4'    '5-&gt;1'    '8-&gt;1'

  Columns 8 through 9

    '222-&gt;1'    '23-&gt;13'
</code></pre>
<p>Note that storing the matches in a cell array is important, because the length of each substring may be different.</p>
<p>Once we have these substrings, we can extract the numbers <strong>before</strong> and <strong>after</strong> the <code>-&gt;</code>. We simply apply two more regular expression calls:</p>
<pre><code>X = regexp(g, '^[0-9]+', 'match');
Y = regexp(g, '[0-9]+$', 'match');
</code></pre>
<p>The first call matches the run of digits at the beginning of each string in <code>g</code>, while the second call matches the run of digits at the end of each string in <code>g</code>. Because <code>g</code> is a cell array, each call returns a cell array whose elements are themselves cells holding the matched digits. Note that these digits are still <strong>strings</strong>, so we should convert them into actual numbers and place them into numeric vectors for you to use with your code:</p>
<pre><code>X = cellfun(@str2double, X);
Y = cellfun(@str2double, Y);
</code></pre>
<p><a href="http://www.mathworks.com/help/matlab/ref/cellfun.html" rel="nofollow"><code>cellfun</code></a> applies a function to each cell in a cell array. In this case, each cell holds a number stored as a string, and we want it as a <code>double</code>, so we use <a href="http://www.mathworks.com/help/matlab/ref/str2double.html" rel="nofollow"><code>str2double</code></a> to do the conversion. Once we're done, we have numeric vectors containing the numbers before the <code>-&gt;</code> and after the <code>-&gt;</code>.</p>
<p>We finally get:</p>
<pre><code>X =

     5     2     4     7     1     5     8   222    23

Y =

     2     3     6     2     4     1     1     1    13
</code></pre>
<p>This tip was originally posted on <a href="http://stackoverflow.com/questions/27426570/Reading%20unknown%20length%20line/27426922">Stack Overflow</a>.</p>
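<p>As a possible variation (not part of the original answer), the three <code>regexp</code> calls can be collapsed into one by using capture groups with the <code>'tokens'</code> option. The sketch below assumes the same example string <code>s</code>; the variable names <code>tok</code> and <code>pairs</code> are made up for illustration:</p>
<pre><code>% Sketch of a single-pass alternative, assuming the same example string.
s = 'bobaboao dsaas : 5-&gt;2 2-&gt;3 4-&gt;6 7-&gt;2 1-&gt;4 5-&gt;1 8-&gt;1 222-&gt;1 23-&gt;13';

% Each parenthesised group captures one ID. 'tokens' returns a 1-by-N cell
% array, where each element is a 1-by-2 cell holding the two captured strings.
tok = regexp(s, '(\d+)-&gt;(\d+)', 'tokens');

% Stack the 1-by-2 cells into an N-by-2 cell array of strings, then convert
% the whole array to doubles in one call.
pairs = str2double(vertcat(tok{:}));

% Split into row vectors: numbers before and after the arrow.
X = pairs(:, 1).';
Y = pairs(:, 2).';
</code></pre>
<p>This should give the same <code>X</code> and <code>Y</code> as the step-by-step version. It skips the intermediate cell array <code>g</code>, which may or may not matter depending on whether you need the matched substrings themselves.</p>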