
TidyEx

TidyEx corrects and cleans up HTML content by fixing markup errors.

Elixir/Erlang bindings for htacg's tidy-html5

The granddaddy of HTML tools, with support for modern standards http://www.html-tidy.org

The binding is implemented as a C-Node following the excellent example in Overbryd's package nodex. If you want to learn how to set up bindings to C/C++, you should definitely check it out.

  • nodex
    • distributed Elixir
    • safe binding with C-Nodes

C-Nodes are external OS processes that communicate with the Erlang VM through Erlang message passing. That way you can implement native code and call into it from Elixir in a safe, predictable way: if the external process crashes, the Erlang VM stays unaffected.
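The isolation property can be illustrated with a minimal sketch. It is not TidyEx's code and it does not use a C-Node: it starts a throwaway `sh` process through a plain Elixir port, lets that process exit with a non-zero status, and shows that the VM only receives a message about it. The module name `IsolationSketch` and the function `run_and_crash/1` are hypothetical, and the sketch assumes `sh` is available on the PATH.

```elixir
# Illustrative only: a plain port stands in for the external process here.
# A real C-Node talks to the VM over distributed Erlang instead.
defmodule IsolationSketch do
  # Starts an external OS process that exits immediately with `status`,
  # then returns the exit status reported back to the calling process.
  def run_and_crash(status) when status in 0..255 do
    sh = System.find_executable("sh")

    port =
      Port.open({:spawn_executable, sh}, [
        :binary,
        :exit_status,
        args: ["-c", "exit #{status}"]
      ])

    # The failure of the external process arrives as an ordinary message;
    # the calling process and the VM keep running.
    receive do
      {^port, {:exit_status, code}} -> code
    after
      5_000 -> :timeout
    end
  end
end

IO.inspect(IsolationSketch.run_and_crash(3))
```

The caller sees the failure as data it can pattern match on, which is the same reason a crash inside the native tidy-html5 code cannot take the Erlang VM down with it.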

Example

For more examples, please check out the tests.

test "can parse broken html" do
  result = TidyEx.parse("<div>Hello<span>World")
  assert result == "<div>Hello<span>World</span></div>"
end

test "can clean and repair broken html" do
  result = TidyEx.clean_and_repair("<div>Hello<span>World")
  assert result == "<div>Hello<span>World</span></div>"
end

test "can run diagnostics on invalid html" do
  result = TidyEx.run_diagnostics("<pp>Hello World</p>")
  assert result == "line 1 column 1 - Error: <pp> is not recognized!\nThis document has errors that must be fixed before\nusing HTML Tidy to generate a tidied up version."
end

Installation

Available on Hex.

def deps do
  [
    {:tidy_ex, "~> 0.1.0-dev"}
  ]
end

Target dependencies

cmake 3.x
erlang-dev
erlang-xmerl
erlang-parsetools

Compile and test

mix deps.get
mix compile
mix test

Cloning

git clone git@github.com:f34nk/tidy_ex.git
cd tidy_ex

All binding targets are added as submodules in the target/ folder.

git submodule update --init --recursive --remote
mix deps.get
mix compile
mix test
mix test.target

Cleanup

mix clean

Roadmap

See CHANGELOG.

  • Bindings
    • Call as C-Node
    • Call as dirty-nif
  • Tests
    • Call as C-Node
    • Call as dirty-nif
    • Target tests
    • Feature tests
    • Package test
  • Features
    • Set tidy-html5 options
    • Serialize any string with valid or broken html
    • Clean and repair
    • Run diagnostics
  • Documentation
  • Publish as hex package

Icon Credit

Broom by faisalovers from the Noun Project
