0.4.23
======

  - Various improvements to the NEON backend to bring it closer to the SSE
    backend (Wim Taymans)
  - Add support for setting a custom backup function (Wim Taymans)
  - Preserve NEON/VFP registers across subroutines (Jerome Laheurte)
  - Fix 64 bit parameter loading on big-endian systems (Tim-Philipp Müller)
  - Improved implementations for various opcodes (Wim Taymans)
  - Various improvements and fixes to constants handling (Wim Taymans)
  - Avoid some undefined operations on signed integers (Wim Taymans)
  - Prefer user specific directories over global ones for intermediate files
    to prevent name collisions (Fabian Deutsch)

0.4.22
======

Maintenance release:

  - Handle NOCONFIGURE=1 in autogen.sh (Colin Walters)
  - Some memory leak fixes in the compiler (Sebastian Dröge, Thiago Santos)
  - Fixes for compiler warnings on Win64 (Edward Hervey)
  - Properly detect CPU features on Android in non-debug build (Jan Schmidt)
  - Use Android logging system instead of stderr for debug output (Jan Schmidt)

0.4.21
======

Maintenance release:

  - Add libtool versioning to the linker flags again. This was accidentally
    removed in 0.4.20 but should not cause any problems on platforms other
    than OS X (Sebastian Dröge)


0.4.20
======

Maintenance release:

  - Fix list corruption when splitting code memory chunks, causing crashes
    when allocating a lot of code memory and trying to free it later
    (Tim-Philipp Müller)
  - Add some extra checks for the number of variables used in ORC code to
    prevent overflows and crashes in the compiler (Vincent Penquerc'h)
  - Various compiler warnings, coverity warnings and static code analysis
    fixes (Sebastian Dröge)

0.4.19
======

Maintenance release:

  - Fix out-of-tree builds (Edward Hervey)
  - Fix many memory leaks, compiler warnings and coverity warnings (Tim-Philipp Müller,
    Olivier Crête, Todd Agulnick, Sebastian Dröge, Vincent Penquerc'h, Edward Hervey)
  - Documentation fix for mulhsw, mulhuw (William Manley)

0.4.18
======

Maintenance release:

 - Important bugfix in reading constants from bytecode. (Tim-Philipp Müller
   and Sebastian Dröge)
 - Documentation and code cleanup (Stefan Sauer)
 - Fix cache flushing on iOS (Andoni Morales Alastruey)


0.4.17
======

Maintenance release:

 - Merged known distro patches.
 - Added MIPS backend (Guillaume Emont).
 - Disabled ARM backend because of poor coverage.
 - Added bytecode parsing and writing.  This can be used instead of
   manual creation of OrcPrograms.


0.4.16
======

Fix a few bugs people noticed in 0.4.15.

 - orc_init() tried to take the same mutex as generated C code that
   calls (indirectly) orc_init().
 - sse: Fixes for 64 bit pointers with any of the upper 32 bits set.


0.4.15
======

This should have been released much earlier.

 - Protect global resources with mutexes.  Duh.  This solves a bunch
   of bug reports.
 - Restore c64x-c backend.  Untested.
 - Convert MMX and SSE backends to a new instruction scheduler.
 - Add alignment and size hints to parser.


0.4.14
======

Yet more bug fixing.  Altivec should work again, OS/X should
work again.  MMX should work again.  Another codegen bug on
SSE fixed.


0.4.13
======

Fixes two serious code generation bugs in 0.4.12 on SSE and
Altivec.  Also added some compatibility code to mitigate
the previous automatic inclusion of stdint.h.


0.4.12
======

This is primarily a bug fixing release.

 - Fix gcc-4.6 warnings in generated code
 - Codegen fixes for Altivec.  Passes regression tests again.
 - More error checking for code allocation.
 - NEON: floating point improvements
 - Removed stdint.h from API.  This could theoretically cause
   breakage if you depended on orc to include stdint.h.

One new feature is the OrcCode structure, which keeps track of
compiled code.  This now allows applications to free unused code.

Internally, x86 code generation was completely refactored to add
an intermediate stage, which will later be used for instruction
reordering.  None of this is useful yet.


0.4.11
======

This is primarily a bug fixing release.

 - Fixes for CPUs that don't have backends
 - Fix loading of double parameters
 - mmx: Fix 64-bit parameter loading
 - sse/mmx: Fix x2/x4 with certain opcodes

There are still some issues with the ARM backend on certain
architecture levels (especially ARMv6).  Some assistance from
a user with access to such hardware would be useful.


0.4.10
======

Changes:

 - Added several simple 64-bit opcodes
 - Improved debugging by adding ORC_CODE=emulate
 - Allocation of mmap'd areas for code now has several fallback
   methods, in order to placate various SELinux configurations.
 - Various speed improvements in SSE backend
 - Add SSE implementations of ldreslinl and ldresnearl.
 - Update Mersenne Twister example

There was a bug in the calculation of maximum loop shift that, when
fixed, increases the speed of certain functions by a factor of two.
However, the fix also triggers a bug in Schroedinger, which is fixed
in the 1.0.10 release.


0.4.9
=====

This is primarily a bug fixing release.

Changes:

 - Added handling for 64-bit constants
 - Fix building and use of static library
 - Fix register allocation on Win64 (still partly broken, however)
 - Quiet some non-errors printed by orcc in 0.4.8.
 - Fix implementation of several opcodes.

Until this release, the shared libraries all had the same versioning
information.  This should be fixed going forward.


0.4.8
=====

Changes:

 - Fix Windows and OS/X builds
 - Improve behavior in failure cases
 - Major improvements for Altivec backend
 - Significant documentation additions

Memory for executable code storage is now handled in a much more
controlled manner, and it's now possible to reclaim this memory
after it's no longer needed.

A few more 64-bit opcodes have been added, mostly related to
arithmetic on floating point values.

The orcc tool now handles 64-bit and floating point parameters
and constants.


0.4.7
=====

Changes:

 - Lots of specialized new opcodes and opcode prefixes.
 - Important fixes for ARM backend
 - Improved emulation of programs (much faster)
 - Implemented fallback rules for almost all opcodes for
   SSE and NEON backends
 - Performance improvements for SSE and NEON backends.
 - Many fixes to make larger programs compile properly.
 - 64-bit data types are now fully implemented, although
   there are few operations on them.

Loads and stores are now handled by separate opcodes (loadb,
storeb, etc).  For compatibility, these are automatically
included where necessary.  This allowed new specialized
loading opcodes, for example, resampling a source array
for use in scaling images.

Opcodes may now be prefixed by "x2" or "x4", indicating that
an operation should be done on 2 or 4 parts of a proportionally
larger value.  For example, "x4 addusb" performs 4 saturated
unsigned additions, one on each of the four bytes of 32-bit
quantities.  This is useful in pixel operations.
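
As a rough illustration of what "x4 addusb" computes, here is a
plain-C model of the operation on a single 32-bit value.  This is only
a sketch of the arithmetic, not code taken from Orc; the function names
addusb_lane and x4_addusb are made up for this example.

```c
#include <stdint.h>

/* Saturating unsigned byte add: the per-lane arithmetic of "addusb".
 * (Hypothetical helper name, not part of the Orc API.) */
static uint8_t addusb_lane(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Plain-C model of "x4 addusb" on one 32-bit quantity: each of the
 * four bytes is added independently and clamped to 255, so a carry
 * never leaks into the neighbouring byte. */
uint32_t x4_addusb(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        int shift = lane * 8;
        uint8_t la = (uint8_t)(a >> shift);
        uint8_t lb = (uint8_t)(b >> shift);
        result |= (uint32_t)addusb_lane(la, lb) << shift;
    }
    return result;
}
```

With 8-bit-per-channel pixels packed into 32 bits, this is the usual
"add and clamp" step: a channel that would overflow stays at 255
instead of wrapping around.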

The MMX backend is now (semi-) automatically generated from
the SSE backend.

The orcc tool has a new option "--inline", which creates inline
versions of the Orc stub functions.  The orcc tool also recognizes
a new directive '.init', which instructs the compiler to generate
an initialization function, which when called at application init
time, compiles all the generated functions.  This allows the
generated stub functions to avoid checking if the function has
already been compiled.  The use of these two features can
dramatically decrease the cost of calling Orc functions.

Known Bugs: Orc generates code that crashes on 64-bit OS/X.

Plans for 0.4.8: (was 2.5 for 4 this time around, not too bad!)
Document all the new features in 0.4.7.  Instruction scheduler.
Code and API cleanup.



0.4.6
=====

Changes:

 - Various fixes to make Orc more portable
 - Major performance improvements to NEON backend
 - Minor performance improvements to SSE backend
 - Major improvements to ARM backend, now passes regression
   tests.

The defaults for floating point operations have been changed
somewhat: NaNs are handled more like the IEEE 754 standard,
and denormals in operations are treated as zeros.  The NaN
changes cause certain SSE operations to be slightly slower,
but produce less surprising results.  Treating denormals as
zero has effects ranging from "slightly faster" to "now possible".
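
To make "denormals are treated as zeros" concrete, here is a small
plain-C sketch of the idea for 32-bit floats.  It is not Orc code: the
names flush_denormal and addf_daz are hypothetical, and flushing both
the inputs and the result is an assumption made for this example.

```c
#include <float.h>

/* Replace a denormal (subnormal) float with a zero of the same sign.
 * Normal numbers, zeros, infinities and NaNs pass through unchanged.
 * (Hypothetical helper, not part of the Orc API.) */
float flush_denormal(float x)
{
    /* FLT_MIN is the smallest positive *normal* float, so anything
     * nonzero that lies strictly between -FLT_MIN and FLT_MIN is
     * denormal.  Comparisons with NaN are false, so NaN is returned
     * as-is. */
    if (x != 0.0f && x > -FLT_MIN && x < FLT_MIN)
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}

/* Model of a float addition with denormals treated as zero.
 * Assumption: both the inputs and the result are flushed. */
float addf_daz(float a, float b)
{
    return flush_denormal(flush_denormal(a) + flush_denormal(b));
}
```

For example, addf_daz(1e-40f, 1.0f) behaves exactly like 0.0f + 1.0f,
since 1e-40f is below FLT_MIN and is flushed before the addition.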

New tool: orc-bugreport.  Mainly this is to provide a limited
testing tool in the field, especially for embedded targets
which would not have access to the testsuite, since the
testsuite is not installed.

The environment variable ORC_CODE can now be used to adjust
some code generation.  See orc-bugreport --help for details.

orcc has a new option to generate code that is compatible
with older versions of Orc.  For example, if your software
package only uses 0.4.5 features, you can use --compat 0.4.5
to generate code that runs on 0.4.5, otherwise it may generate
code that requires 0.4.6.  Useful for generating source code
for distribution.

New NEON detection relies on Linux 2.6.29 or later.

Plans for 0.4.7: (not that past predictions have been at all
accurate) New opcodes for FIR filtering, scaling and compositing
of images and video.  Instruction scheduler, helpful for non-OOO
CPUs.  Minor SSE/NEON improvements.  Orcc generation of inline
macros.


0.4.5
=====

This release contains many small improvements related to
converting GStreamer from liboil to Orc.

The major addition in this release is the mainstreaming of
the NEON backend, made possible by Nokia.

There is a new experimental option to ./configure,
--enable-backend, which allows you to choose a single code
generation backend to include in the library.  This is mostly
useful for embedded systems, and is not recommended in general.

The upcoming release will focus on improving code generation
for the SSE and NEON backends.


0.4.4
=====

This is almost entirely a cleanup and bug fix release.

 - fix register copying on x86-64
 - better checking for partial test failures
 - fix documentation build
 - fix build on many systems I don't personally use
 - various fixes to build/run on Win64 (Ramiro Polla)
 - add performance tests

Next release will merge in the new pixel compositing opcodes
and the SSE instruction scheduler.


0.4.3
=====

New opcodes: all the 32-bit float opcodes from the orc-float
library have been moved into the core library.

New opcodes: splitlw and splitwb, which are equivalent to
select0lw, select1lw, select0wb, and select1wb, except that
the new opcodes split a value into two destinations in one
opcode.
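
The relationship between the select and split opcodes can be sketched
in plain C as follows.  This models only the 32-bit-to-16-bit case and
is not Orc code: the model_* names and the struct are hypothetical, and
the sketch assumes that "select0" means the low half and "select1" the
high half.  The text above does not say which destination of splitlw
receives which half, so the two outputs are simply named hi and lo.

```c
#include <stdint.h>

/* Assumed meaning of select0lw: the low 16 bits of a 32-bit value. */
uint16_t model_select0lw(uint32_t v)
{
    return (uint16_t)(v & 0xffffu);
}

/* Assumed meaning of select1lw: the high 16 bits of a 32-bit value. */
uint16_t model_select1lw(uint32_t v)
{
    return (uint16_t)(v >> 16);
}

/* Both halves of a 32-bit value (hypothetical type for this sketch). */
struct model_split16 {
    uint16_t hi;
    uint16_t lo;
};

/* Model of splitlw: one operation that yields the same two values as
 * running select1lw and select0lw separately on the same source. */
struct model_split16 model_splitlw(uint32_t v)
{
    struct model_split16 out;

    out.hi = model_select1lw(v);
    out.lo = model_select0lw(v);
    return out;
}
```

splitwb would follow the same pattern one size down, splitting a
16-bit value into two bytes.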

New backend: c64x-c, for the TI C64x+ DSP.  This backend only
produces source code, unlike other backends which can produce
both source and binary code.  Generating code for this backend
can be done using 'orcc --assembly --target=c64x-c'.

Orc now understands and can generate code for two-dimensional
arrays.  If the size of the array is known at compile time,
this information can be used to improve generated code.

Various improvements to the ARM backend by Wim Taymans.  The
ARM backend is still experimental.


0.4.2
=====

Bug fixes to C backend.  Turns out this is rather important on
CPUs that don't have a native backend.

New features have been postponed to 0.4.3.


0.4.1
=====

This release introduces the orcc program, which parses .orc files and
outputs C source files to compile into a library or application.  The
main source file implements the functions described by the .orc source
code, by creating Orc programs and compiling them at runtime.  Another
source file that it outputs is a test program that can be compiled and
run to determine if Orc is generating code correctly.  In future
releases, the orcc tool will be expanded to output assembly code, as
well as make it easier to use Orc in a variety of ways.

Much of Schroedinger and GStreamer has been converted to use Orc
instead of liboil, along with some code that wasn't able to use
liboil.  To enable this in Schroedinger, use the --enable-orc
configure option.  The GStreamer changes are in the orc branch in
the repository at http://cgit.freedesktop.org/~ds/gstreamer

Scheduled changes for 0.4.2 include a 2-D array mode for converting
the remaining liboil functions used in Schroedinger and GStreamer.


Major changes:

 - Add the orcc compiler.  Generates C code that creates Orc programs
   from .orc source files.
 - Improved testing
 - Fixes in the C backend
 - Fix the MMX backend to emit 'emms' instructions.
 - Add a few rules to the SSE backend.



0.4.0
=====

Stuff happened.