
Regex To Extract First 10 Characters Of A String

There are two scenarios. First case: I have an HTML string, for example: var str1='hello how are you

how to extract the first 10 characters of a htmlstri

Solution 1:

In light of your rephrased question:

A regular expression is not a good tool for handling HTML.
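To see why, here is a minimal sketch of what goes wrong once the markup is nested. The input string is made up for illustration, and the pattern is a simple "first tag, then 10 characters" approach rather than anything from the question.

```javascript
// Hypothetical input: an <em> that contains a nested <b>.
var html = "<em>hi <b>there</b> everyone</em>";

// Capture the first tag, then the next 10 characters, then re-close that tag.
var naive = html.replace(/<(.*?>)(.{10}).*/, '<$1$2</$1');

console.log(naive);
// prints "<em>hi <b>ther</em>"
```

The characters of the inner `<b>` tag were counted as if they were text, so only 7 visible characters survive, and the `<b>` element is never closed.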

The proper way is to parse the DOM. Jack gives an example of this, but with the assumption that the markup you want to keep is the first child of the node you are looking at.

The question I link above indicates that this is not the case. However, Jack's solution can be adapted to handle arbitrary nesting. I do this by counting the characters of the nodes until I reach the break point, then recursively modifying the final node. Finally, I drop all nodes that occur after the required number of characters has been found.

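function getNodeWithNChars(capture, node)
{
  var len = node.childNodes.length;
  var i = 0;
  var toRemove = [];
  for (; i < len; i++)
  {
    var child = node.childNodes[i];
    if (capture === 0)
    {
      // The character budget is used up: mark everything that follows for removal.
      toRemove.push(child);
    }
    else if (child.textContent.length < capture)
    {
      // The whole child fits: keep it and reduce the remaining budget.
      capture = capture - child.textContent.length;
    }
    else
    {
      if (child.childNodes.length === 0)
      {
        // Leaf node: cut its text at the break point.
        child.textContent = child.textContent.substring(0, capture);
      }
      else
      {
        // Element with children: truncate it in place, recursively.
        // (Assigning the result back to node.childNodes[i] has no effect.)
        getNodeWithNChars(capture, child);
      }
      capture = 0;
    }
  }
  // Remove the marked nodes only after the loop, so the indices stay valid.
  for (i = 0; i < toRemove.length; i++)
  {
    node.removeChild(toRemove[i]);
  }
  return node;
}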

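function getNChars(n, str)
{
  // Let the browser parse the markup inside a detached container element.
  var node = document.createElement('div');
  node.innerHTML = str;
  node = getNodeWithNChars(n, node);
  // Serialize the truncated tree back to an HTML string.
  return node.innerHTML;
}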

Example invocation of the above function:

console.log(getNChars(25,"hello how are <b>you</b> <em>how <b>to extract the</b> first 25 characters of a htmlstring without losing the html of the string</em>"));
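The counting walk described above does not depend on the DOM itself. As a rough sketch that also runs where no `document` exists (for example in Node), the same idea can be applied to a plain object tree. The node shape (`{ text }` for text, `{ tag, children }` for elements) and the names `textLength`, `truncateTree` and `render` are assumptions made for this illustration, not part of the answer above.

```javascript
// Hypothetical node shapes standing in for DOM nodes:
//   text node:    { text: "..." }
//   element node: { tag: "b", children: [...] }

// Total number of text characters below a node.
function textLength(node) {
  if (node.children === undefined) return node.text.length;
  return node.children.reduce(function (sum, child) {
    return sum + textLength(child);
  }, 0);
}

// Returns a truncated copy of node holding at most n text characters.
function truncateTree(node, n) {
  if (node.children === undefined) {
    return { text: node.text.substring(0, n) };
  }
  var kept = [];
  var remaining = n;
  for (var i = 0; i < node.children.length && remaining > 0; i++) {
    var child = node.children[i];
    var len = textLength(child);
    if (len <= remaining) {
      kept.push(child);                            // fits entirely
      remaining -= len;
    } else {
      kept.push(truncateTree(child, remaining));   // cut inside this child
      remaining = 0;                               // later siblings are dropped
    }
  }
  return { tag: node.tag, children: kept };
}

// Serializes the tree back to an HTML string (no attributes, no escaping).
function render(node) {
  if (node.children === undefined) return node.text;
  return "<" + node.tag + ">" + node.children.map(render).join("") + "</" + node.tag + ">";
}

var tree = {
  tag: "div",
  children: [
    { text: "hello " },
    { tag: "b", children: [{ text: "how are" }] },
    { text: " you" }
  ]
};

console.log(render(truncateTree(tree, 10)));
// prints "<div>hello <b>how </b></div>"
```

Unlike the DOM version, this sketch builds a new tree instead of modifying the input, which makes it easier to test but leaves parsing real HTML into such a tree out of scope.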

Solution 2:

Try this:

var str1="<b>hello how are you</b><h1>how to extract the first 10 characters of a htmlstring without losing the html of the string</h1>";
// $1 is the first tag's name plus '>', $2 the 10 characters after it; the tag is then re-closed.
var res = str1.replace(/<(.*?\>)(.{10}).*/, '<$1$2</$1');
console.log(res);

Solution 3:

How about this:

regex = /(<[a-z0-9]+>|)([a-z0-9 ]{0,10})[a-z0-9 ]*(<\/[a-z0-9]+>|).*/

str1 = "hello how are you.how to extract the first 10 characters of a htmlstring without losing the html of the string"
console.log(str1.replace(regex, '$1$2$3'))

str1 = "<b>hello how are you</b><h1>how to extract the first 10 characters of a htmlstring without losing the html of the string</h1>"
console.log(str1.replace(regex, '$1$2$3'))
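If the count needs to vary, the same pattern can be built at run time. This is only a sketch: `truncateSimple` is a hypothetical helper name, and it inherits the pattern's limits (lowercase letters, digits and spaces only, and at most one wrapping tag at the start of the string).

```javascript
// Hypothetical helper: builds the pattern above for an arbitrary count n.
function truncateSimple(str, n) {
  var pattern = new RegExp(
    "(<[a-z0-9]+>|)" +            // $1: optional opening tag
    "([a-z0-9 ]{0," + n + "})" +  // $2: up to n text characters to keep
    "[a-z0-9 ]*" +                // rest of the text run, dropped
    "(<\\/[a-z0-9]+>|)" +         // $3: optional closing tag
    ".*"                          // everything after it, dropped
  );
  return str.replace(pattern, "$1$2$3");
}

console.log(truncateSimple("hello how are you.how to", 10));
// prints "hello how "
console.log(truncateSimple("<b>hello how are you</b><h1>more</h1>", 5));
// prints "<b>hello</b>"
```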
