Finding line-wraps

Suppose I have some block of text on a single line, like so:

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

But for whatever reason (width settings on the containing element, use of text-zoom etc.), on the viewer's screen it displays as two or more lines.

Lorem ipsum dolor sit amet,

consectetur adipiscing elit.

or

Lorem ipsum dolor sit

amet, consectetur

adipiscing elit.

Is there any way to find out via JavaScript where those line-wraps happen?

$('p').text() and $('p').html() return Lorem ipsum dolor sit amet, consectetur adipiscing elit. regardless of how the text is displayed.

Answers:

Answer

Well, if you want something that's ridiculously simple and probably too limited for your needs (it'll need major modification if there is any sort of HTML inside the paragraph), then have a look at this:

var para = $('p');

para.each(function(){
    var current = $(this);
    var text = current.text();
    var words = text.split(' ');

    current.text(words[0]);
    var height = current.height();

    for(var i = 1; i < words.length; i++){
        current.text(current.text() + ' ' + words[i]);

        if(current.height() > height){
            height = current.height();
            // (i-1) is the index of the word before the text wraps
            console.log(words[i-1]);
        }
    }
});

It's so ridiculously simple it might just work. What this does is to break up the text by spaces, then append the words back word by word, watching for any increase in the height of the element, which would indicate a line wrap.

Have a look at it here: http://www.jsfiddle.net/xRPYN/2/

Answer

For a use case like PDF generation, you can limit each line to a fixed number of characters; if a split falls mid-word, move it back to the nearest word boundary.

To get a more accurate characters-per-line figure, you can use monospaced fonts and determine the width per character for each allowed font. Then divide the allowed line width by the character width, and you'll have the allowed characters per line for that font.
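As a rough sketch of that arithmetic (the function names `charsPerLine` and `wrapByChars` and the 7px character width are made up for illustration, not taken from the code below):

```javascript
// Number of monospaced characters that fit on one line.
function charsPerLine(lineWidth, charWidth) {
    return Math.floor(lineWidth / charWidth);
}

// Greedy word wrap to at most `limit` characters per line.
// A word longer than `limit` is split mid-word as a last resort.
function wrapByChars(text, limit) {
    if (limit < 1) {
        throw new RangeError('limit must be at least 1');
    }
    var words = text.split(/\s+/).filter(Boolean);
    var lines = [];
    var line = '';
    words.forEach(function (word) {
        // Hard-split any word that cannot fit on a line by itself.
        while (word.length > limit) {
            if (line) {
                lines.push(line);
                line = '';
            }
            lines.push(word.slice(0, limit));
            word = word.slice(limit);
        }
        if (!line) {
            line = word;
        } else if (line.length + 1 + word.length <= limit) {
            line += ' ' + word;
        } else {
            lines.push(line);
            line = word;
        }
    });
    if (line) {
        lines.push(line);
    }
    return lines;
}

// Assumed numbers: a 300px wide line and a font whose characters are 7px wide.
var limit = charsPerLine(300, 7);
console.log(limit, wrapByChars('Lorem ipsum dolor sit amet, consectetur adipiscing elit.', limit));
```

This only approximates what a browser does, since real layout also depends on hyphenation, kerning and the font actually loaded.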

You could use non-monospaced fonts, but then you'll have to measure each letter's width - ugh. One way to automate the width guessing is to create a span with no margin or padding, put each character into it for each font (and size), then measure the width of the span and use that.

I've done up the code:

/**
 * jQuery getFontSizeCharObject
 * @version 1.0.0
 * @date September 18, 2010
 * @since 1.0.0, September 18, 2010
 * @package jquery-sparkle {@link http://www.balupton/projects/jquery-sparkle}
 * @author Benjamin "balupton" Lupton {@link http://www.balupton.com}
 * @copyright (c) 2010 Benjamin Arthur Lupton {@link http://www.balupton.com}
 * @license Attribution-ShareAlike 2.5 Generic {@link http://creativecommons.org/licenses/by-sa/2.5/}
 */
$.getFontSizeCharObject = function(fonts,sizes,chars){
    var fonts = fonts||['Arial','Times'],
        sizes = sizes||['12px','14px'],
        chars = chars||['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','y','x','z',
                        'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','Y','X','Z',
                        '0','1','2','3','4','5','6','7','8','9','-','=',
                        '!','@','#','$','%','^','&','*','(',')','_','+',
                        '[',']','{','}','\\','|',
                        ';',"'",':','"',
                        ',','.','/','<','>','?',' '],
        font_size_char = {},
        $body = $('body'),
        $span = $('<span style="padding:0;margin:0;letter-spacing:0;word-spacing:0"/>').appendTo($body);

    $.each(fonts, function(i,font){
        $span.css('font-family', font);
        font_size_char[font] = font_size_char[font]||{};
        $.each(sizes, function(i,size){
            $span.css('font-size',size);
            font_size_char[font][size] = font_size_char[font][size]||{};
            $.each(chars,function(i,char){
                if ( char === ' ' ) {
                    $span.html('&nbsp;');
                }
                else {
                    $span.text(char);
                }
                var width = $span.width()||0;
                font_size_char[font][size][char] = width;
            });
        });
    });

    $span.remove();

    return font_size_char;
};

/**
 * jQuery adjustedText Element Function
 * @version 1.0.0
 * @date September 18, 2010
 * @since 1.0.0, September 18, 2010
 * @package jquery-sparkle {@link http://www.balupton/projects/jquery-sparkle}
 * @author Benjamin "balupton" Lupton {@link http://www.balupton.com}
 * @copyright (c) 2010 Benjamin Arthur Lupton {@link http://www.balupton.com}
 * @license Attribution-ShareAlike 2.5 Generic {@link http://creativecommons.org/licenses/by-sa/2.5/}
 */
$.fn.adjustedText = function(text,maxLineWidth){
    var $this = $(this),
        font_size_char = $.getFontSizeCharObject(),
        char_width = font_size_char['Times']['14px'],
        maxLineWidth = parseInt(maxLineWidth,10),
        newlinesAt = [],
        lineWidth = 0,
        lastSpace = null;

    text = text.replace(/\s+/g, ' ');

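    // Split into characters first: newer jQuery versions reject a primitive string in $.each
    $.each(text.split(''),function(i,char){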
        var width = char_width[char]||0;
        lineWidth += width;
        if ( /^[\-\s]$/.test(char) ) {
            lastSpace = i;
        }
        //console.log(i,char,lineWidth,width);
        if ( lineWidth >= maxLineWidth ) {
            newlinesAt.push(lastSpace||i);
            lineWidth = width;
            lastSpace = null;
        }
    });

    $.each(newlinesAt,function(i,at){
        text = text.substring(0,at+i)+"\n"+text.substring(at+i);
    });

    text = text.replace(/\ ?\n\ ?/g, "\n");

    console.log(text,newlinesAt);

    $this.text(text);

    return $this;
};

$(function(){
    var $body = $('body'),
        $textarea = $('#mytext'),
        $btn = $('#mybtn'),
        $div = $('#mydiv');

    if ( $textarea.length === 0 && $div.length === 0 ) {
        $body.empty();

        $textarea = $('<textarea id="mytext"/>').val('(When spoken repeatedly, often three times in succession: blah blah blah!) Imitative of idle, meaningless talk; used sometimes in a slightly derogatory manner to mock or downplay another\'s words, or to show disinterest in a diatribe, rant, instructions, unsolicited advice, parenting, etc. Also used when recalling and retelling another\'s words, as a substitute for the portions of the speech deemed irrelevant.').appendTo($body);
        $div = $('<div id="mydiv"/>').appendTo($body);
        $btn = $('<button id="mybtn">Update Div</button>').click(function(){
            $div.adjustedText($textarea.val(),'300px');
        }).appendTo($body);

        $div.add($textarea).css({
            'width':'300px',
            'font-family': 'Times',
            'font-size': '14px'
        });
        $div.css({
            'width':'auto',
            'white-space':'pre',
            'text-align':'left'
        });
    }

});

Answer

Here's what I ended up using (feel free to critique and copy for your own nefarious purposes).

First off, when the edit comes in from the user, it's broken up with $(editableElement).lineText(userInput).

jQuery.fn.lineText = function (userInput) {
   var a = userInput.replace(/\n/g, " \n<br/> ").split(" ");
   $.each(a, function(i, val) { 
      if(!val.match(/\n/) && val!="") a[i] = '<span class="word-measure">' + val + '</span>';
   });
   $(this).html(a.join(" "));
};

The newline replacement happens because the editing textbox is populated with $(editableElement).text(), which ignores <br/> tags, but they will still change the height of the following line in the display for typesetting purposes. This was not part of the initial objective, just fairly low-hanging fruit.

When I need to pull out formatted text, I call $(editableElement).getLines(), where

jQuery.fn.getLines = function (){
   var count = $(this).children(".word-measure").length;
   var lineAcc = [$(this).children(".word-measure:eq(0)").text()];
   var textAcc = [];
   for(var i=1; i<count; i++){
      var prevY = $(this).children(".word-measure:eq("+(i-1)+")").offset().top;
      if($(this).children(".word-measure:eq("+i+")").offset().top==prevY){
         lineAcc.push($(this).children(".word-measure:eq("+i+")").text());
      } else {
         textAcc.push({text: lineAcc.join(" "), top: prevY});
         lineAcc = [$(this).children(".word-measure:eq("+i+")").text()];
      }
   }
   textAcc.push({text: lineAcc.join(" "), top: $(this).children(".word-measure:last").offset().top});
   return textAcc;
};

The end result is a list of hashes, each one containing the content and vertical offset of a single line of text.

[{"text":"Some dummy set to","top":363},
 {"text":"demonstrate...","top":382},
 {"text":"The output of this","top":420},
 {"text":"wrap-detector.","top":439}]

If I just want unformatted text, $(editableElement).text() still returns

"Some dummy set to demonstrate... The output of this wrap-detector."

Answer

The solutions above break down once the paragraph has a more complex structure, such as a link (e.g. you can have <b>, <i> or <a href> elements inside a <p>).

So I made a JavaScript library to detect where lines wrap that works in those cases: http://github.com/xdamman/js-line-wrap-detector

I hope this helps.

Answer

I have a situation where I need to wrap each line in a span. I do this so that I can add a padded highlight effect to a text block. Adding the background to a single span that wraps the whole text only pads the beginning and end of the text block, so each line must be wrapped individually.

This is what I came up with based on the suggestions above:

$.fn.highlghtWrap = function () {
    this.each( function () {
      var current = $( this );
      var text = current.text();
      var words = text.split( ' ' );
      var line = '';
      var lines = [];

      current.text( words[ 0 ] );
      var height = current.height();
      line = words[ 0 ];
      for ( var i = 1; i < words.length; i++ ) {
        current.text( current.text() + ' ' + words[ i ] );

        if ( current.height() > height ) {
          lines.push( line );
          line = words[ i ];
          height = current.height();
        } else {
          line = line + ' ' + words[ i ];
        }
      }
      lines.push( line );
      current.html( '' );
      $.each( lines, function ( v, a ) {
        current.html( current.html() + '<span>' + a +
          ' </span>' );
      } );
    } );
};

$( '.home-top_wrapper h2' ).highlghtWrap();
$( '.home-top_wrapper p' ).highlghtWrap();

Answer

A conceptually simple approach, which also works when there's internal markup and arbitrary fonts and styles, is to make a first pass that simply puts every word into its own element (maybe 'SPAN', or a custom name like 'w').
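A minimal sketch of such a first pass, assuming a modern browser DOM; the helper names `tokenize` and `wrapWords` are hypothetical, and whitespace is deliberately left outside the 'w' elements so the text still wraps normally:

```javascript
// Split a string into alternating whitespace and word tokens, keeping both.
function tokenize(text) {
    return text.match(/\s+|\S+/g) || [];
}

// Replace every text node under `root` with a fragment in which each word
// sits in its own <w> element. Existing markup (links, bold, ...) stays put,
// because only text nodes are touched.
// Assumes a browser environment (document, NodeFilter).
function wrapWords(root) {
    var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
    var textNodes = [];
    // Collect first, so the tree is not modified while it is being walked.
    while (walker.nextNode()) {
        textNodes.push(walker.currentNode);
    }
    textNodes.forEach(function (node) {
        var fragment = document.createDocumentFragment();
        tokenize(node.nodeValue).forEach(function (token) {
            if (/\S/.test(token)) {
                var w = document.createElement('w');
                w.textContent = token;
                fragment.appendChild(w);
            } else {
                fragment.appendChild(document.createTextNode(token));
            }
        });
        node.parentNode.replaceChild(fragment, node);
    });
}
```

In a page this might be called once per paragraph, e.g. `wrapWords(document.querySelector('p'))`, before measuring.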

Then you can iterate using getBoundingClientRect() to find where the 'top' property changes:

function findBreaks() {
    var words = document.getElementsByTagName('w');
    var lastTop = null; // null so the first word always counts as a new line, even at top 0
    for (var i=0; i<words.length; i++) {
        var newTop = words[i].getBoundingClientRect().top;
        if (newTop == lastTop) continue;
        console.log("new line " + words[i].textContent + " at: " + newTop);
        lastTop = newTop;
    }
}

It sounds slow, but unless the documents are really big you won't notice.
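If you want the lines themselves rather than log output, the measurements can be grouped in a separate step. The sketch below is one possible way to do that: `groupIntoLines` and its `tolerance` parameter are hypothetical names, and the input is a plain array of `{ text, top }` objects so the grouping can be tried without a browser.

```javascript
// Group measured words into lines. Each item is { text, top }, where `top`
// would come from element.getBoundingClientRect().top in a browser.
// `tolerance` (an assumption, in pixels) absorbs sub-pixel differences
// between words that are visually on the same line.
function groupIntoLines(measured, tolerance) {
    tolerance = tolerance === undefined ? 0.5 : tolerance;
    var lines = [];
    var current = null;
    measured.forEach(function (item) {
        // Start a new line whenever the top moves away from the line's first word.
        if (current === null || Math.abs(item.top - current.top) > tolerance) {
            current = { top: item.top, words: [] };
            lines.push(current);
        }
        current.words.push(item.text);
    });
    return lines.map(function (line) {
        return { text: line.words.join(' '), top: line.top };
    });
}

// In a browser the input could be built like this (not executed here):
// var measured = Array.prototype.map.call(
//     document.getElementsByTagName('w'),
//     function (el) { return { text: el.textContent, top: el.getBoundingClientRect().top }; }
// );
```

Comparing against a tolerance rather than with `==` is a design choice for fractional pixel values; words in a noticeably different font size on the same line may still be split apart.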
