Commit Graph

2404 Commits

Author SHA1 Message Date
Lioncash
726b9914c5 common/fp/op/FPRSqrtEstimate: Add half-precision specialization for FPRSqrtEstimate 2020-04-22 21:01:44 +01:00
Lioncash
2184d24e8f frontend/ir_emitter: Add half-precision opcode for FPRecipEstimate 2020-04-22 21:01:44 +01:00
Lioncash
af2e5afed6 common/fp/op: Add half-precision specialization for FPRecipEstimate 2020-04-22 21:01:44 +01:00
Lioncash
d7f394fc1a A64: Enable half-precision vector FRINT* variants 2020-04-22 21:01:44 +01:00
Lioncash
5d5c9f149f frontend/ir_emitter: Add half-precision opcode for FPVectorRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
24f583c498 A64: Enable half-precision variants of floating-point FRINT* variants
With all the backing machinery in place, we can remove the fallback
check for half-precision.
2020-04-22 21:01:44 +01:00
Lioncash
6da0411111 frontend/ir_emitter: Add half-precision opcode for FPRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
fb829b9525 frontend/microinstruction: Add FPVectorRoundInt types to ReadsFromAndWritesToFPSRCumulativeExceptionBits()
All variants were previously missing from this.
2020-04-22 21:01:44 +01:00
Lioncash
68d8cd2b13 common/fp/op: Add half-precision specialization for FPRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
5b4673da4b frontend/ir_emitter: Add half-precision variant of FPVectorRoundInt 2020-04-22 21:01:44 +01:00
Lioncash
ad0c698f89 frontend/ir_emitter: Add half-precision variant of FPRoundInt 2020-04-22 21:01:44 +01:00
Lioncash
61cec94a19 fp/op/FPRoundInt: Add half-precision specialization of FPRoundInt 2020-04-22 21:01:44 +01:00
Merry
cb9a1b18b6 Merge pull request #475 from lioncash/muladd
A64: Enable half-precision variants of floating-point multiply-add instructions
2020-04-22 21:01:44 +01:00
Merry
d6db7ad46c Merge pull request #474 from lioncash/bracing
load_store_*: Make bracing consistent and variables const where applicable
2020-04-22 21:01:44 +01:00
Merry
1b6520f5dd A64/location_descriptor: Ensure FZ16 is included in the FPCR mask 2020-04-22 21:01:44 +01:00
Merry
13f421c27d Merge pull request #473 from lioncash/sqshlu
A64: Implement SQSHLU
2020-04-22 21:01:44 +01:00
Lioncash
b5bf890584 load_store_*: Make bracing consistent and variables const where applicable
Makes bracing consistent, and variables const where applicable to be
consistent with the rest of the codebase.

In most bracing cases, they'd need to be added to conditionals that
would involve checking stack pointer alignment in the future anyways.
2020-04-22 21:01:44 +01:00
Lioncash
9a58c3f1c7 A64: Implement FMLA/FMLS' half-precision vector indexed variants 2020-04-22 21:01:44 +01:00
Merry
d7da53a74b Merge pull request #472 from lioncash/exception
general: Mark hash functions as noexcept
2020-04-22 21:01:44 +01:00
Lioncash
9dcc04e106 A64: Implement SQSHLU's scalar variant 2020-04-22 21:01:44 +01:00
Merry
b91c6c8bae Merge pull request #471 from lioncash/sqrdmulh
A64: Implement SQRDMULH's scalar vector variant
2020-04-22 21:01:44 +01:00
Lioncash
1fdd3ef8a0 A64: Implement FMLA/FMLS' half-precision scalar indexed variants 2020-04-22 21:01:44 +01:00
Lioncash
2d59d10ac8 A64: Implement SQSHLU's vector variant
The vector shift by immediate category is now fully implemented.
2020-04-22 21:01:44 +01:00
Merry
b5e25959d9 Merge pull request #470 from lioncash/assert
general: Replace unreachable-imitating assertions with UNREACHABLE()
2020-04-22 21:01:44 +01:00
Lioncash
d6606deda2 A64: Implement half-precision vector variants of FMLA/FMLS 2020-04-22 21:01:44 +01:00
Lioncash
a4cadf1cd9 frontend/ir_emitter: Add opcodes for signed saturated left shifts with unsigned saturation 2020-04-22 21:01:44 +01:00
Lioncash
ec6b3ae084 ir/frontend: Add half-precision opcode for FPVectorMulAdd 2020-04-22 21:01:44 +01:00
Lioncash
5f74d25bf7 A64: Enable half-precision floating point variants of FP data-processing three register instructions
This handles half-precision floating point for:

- FMADD
- FMSUB
- FNMADD
- FNMSUB
2020-04-22 21:01:44 +01:00
Lioncash
bd82513199 frontend/ir_emitter: Add half-precision opcode for FPMulAdd 2020-04-22 21:01:44 +01:00
Lioncash
79a892d23c fp/op/FPMulAdd: Add half-precision floating-point specialization 2020-04-22 21:01:44 +01:00
Lioncash
7bb5440507 general: Mark hash functions as noexcept
Generally hash functions shouldn't throw exceptions. It's also a
requirement for the standard library-provided hash functions to not
throw exceptions.

An exception to this rule is made for user-defined specializations,
however we can just be consistent with the standard library on this to
allow it to play nicer with it.

While we're at it, we can also make the std::less specializations
noexcpet as well, since they also can't throw.
2020-04-22 21:01:43 +01:00
Lioncash
3b46b4a37d A64: Implement SQRDMULH's scalar vector variant
Implements the scalar variant in terms of the vector variant for the
time being.
2020-04-22 21:01:43 +01:00
Lioncash
fe95575b95 general: Replace unreachable-imitating assertions with UNREACHABLE()
We can just use the self-documenting assertion for indicating
unreachable paths, instead of manually passing false and providing a
message.
2020-04-22 21:01:43 +01:00
Merry
4a3d808354 Merge pull request #468 from lioncash/const
ir_opt: Mark locals as const where applicable
2020-04-22 21:01:43 +01:00
Lioncash
64de80839e A64/impl: Reorganize peculiar void use in V_scalar
To a reader this might look particularly strange, given the function
itself has a void return value, but this is actually valid, given the
function in the return statement also has a void return value.

This instead alters it to be a little easier to parse and potentially be
a little less confusing at a glance.
2020-04-22 21:01:43 +01:00
Merry
9a4e3b24e4 Merge pull request #467 from lioncash/reserved
A64: Handle reserved instruction cases more specifically where applicable
2020-04-22 21:01:43 +01:00
Merry
0b794cbcea Merge pull request #466 from lioncash/fcmla
A64: Implement FCMLA's indexed element variant
2020-04-22 21:01:43 +01:00
Merry
994349d154 Merge pull request #465 from neobrain/master
CMakeLists: Allow importing dynarmic build trees into other CMake projects
2020-04-22 21:01:43 +01:00
Lioncash
cfd7513a7d ir_opt/verification_pass: Mark locals as const where applicable
Makes our immutable state a little more explicit.
2020-04-22 21:01:40 +01:00
Lioncash
8309d49588 A64: Handle reserved instruction cases more specifically where applicable
These are cases that are defined as reserved within the ARMv8 reference
manual, so we can handle them as such instead of as unallocated
encodings.

While this doesn't actually change emulated behavior, it does at least
allow the JIT to generate the more appropriate exception.
2020-04-22 21:00:47 +01:00
Lioncash
6c2c68bce6 A64: Implement FCMLA's indexed element variant
With this, all of the instructions introduced with ARMv8.3-CompNum have
an implementation.
2020-04-22 21:00:47 +01:00
Tony Wasserka
7d99a6c00f CMakeLists: Allow importing dynarmic build trees into other CMake projects 2020-04-22 21:00:47 +01:00
Lioncash
1a45f35b9c ir_opt/a64_callback_config_pass: Mark locals as const where applicable
Makes our immutable state a little more explicit.
2020-04-22 21:00:47 +01:00
Lioncash
7bc7042104 simd_scalar_shift_by_immediate: Change UnallocatedEncoding() path in SaturatingShiftLeft to ReservedValue()
Strictly speaking, immh being zero is defined as reserved in the ARMv8
reference manual. This was just an error on my part when introducing the
SQSHL immediate scalar variant.
2020-04-22 21:00:47 +01:00
Lioncash
dc97977576 ir_opt/a32_get_set_elimination_pass: Mark local variables as const where applicable
Makes our intended immutable state slightly more explicit.
2020-04-22 21:00:47 +01:00
Lioncash
b1b4487e4d A64: Implement UQSHL (immediate)'s scalar variant
Like SQSHL's immediate scalar variant, we can also implement UQSHL's
immediate scalar variant in terms of the vector variant for the time
being.
2020-04-22 21:00:47 +01:00
Lioncash
3649dc6d9a A64: Implement scalar variant of SQSHL (immediate)
This can be handled in terms of the vector variant for the time being.
2020-04-22 21:00:47 +01:00
Lioncash
7d535eaba6 ir_opt/a32_constant_memory_reads_pass: Apply const where applicable to locals
Makes immutable state just slightly more explicit.
2020-04-22 21:00:47 +01:00
Lioncash
e1b4ff1068 simd_scalar_shift_by_immediate: Migrate SQSHL implementation to file-scope function
This will allow it to be reused for the implementation of UQSHL.
2020-04-22 21:00:47 +01:00
Lioncash
b37279f65c backend/x64/emit_x64_vector: Prevent undefined behavior within VectorSignedSaturatedShiftLeft
Avoids undefined behavior by potentially left-shifting a signed negative
value.
2020-04-22 21:00:47 +01:00
Lioncash
46eae8cf2f common/fp/op/FPRecipExponent: Prevent undefined behavior from shifting a negative value
Due to promotion rules (types < int, even if unsigned, get promoted to
int when arithmetic is performed on them), this is a potential spot for
undefined behavior.
2020-04-22 21:00:47 +01:00
MerryMage
0066ad2d38 Squashed 'externals/fmt/' changes from 3e75ad982..9e554999c
9e554999c Update version
b34d92b05 Bump version
d39ece187 Make rst2md runnable and update changelog
fe2d715ff Update changelog
27b306701 Update changelog
68837079a Update changelog
c98b202eb Update changelog
587a7f663 Update changelog and docs
84e5170c9 Update changelog and deprecate visit
130e412b6 Update changelog and docs
0bbdca5b8 Fix conversion warnings (#989)
77a724480 Implement fill/align/width for strftime-like formatting
3e01376e0 Implement fill/align/width parsing in chrono formatter
1f92f8a9d Remove noexcept
8668639ae Get rid of null_terminating_iterator in format
93fd473b8 Add support for builtin terminal colors. (#974)
61ad543c3 Windows .sln filename changed from FORMAT to FMT
7f7504b3f Clean up docs
37f599b1a Fix docs
8c2e15aed Make printf work in search (#164)
de71db6d4 Fix asan error (#977)
b180b3915 Fix default formatting
24594c747 Disable printing the reset escape code when no style modifiers where applied. (#973)
b0f222471 Implement default chrono formatting
749276072 Add file stream support for stylized text printing. (#967)
f54f3d0fb Move chrono-specific code to a separate header
bf1f1c73e Fix time test
b6bc6ec24 Add default ctor and fix use of constexpr macros in text_style
acfa95d4a Workaround a bug in MSVC's strftime (#965)
628f83058 More chrono formatting
aa3b5aba4 Implement locale-specific minute formatting
639de2175 Workaround more MSVC bugs
3242ddf7b Fix warnings
bd1104046 Workaround a bug in MSVC
81b5c4a5f Add experimental emphasis support (#961)
7c4eb0fbe Fix warnings in time.h
2d624218b Fix another warning
b31680990 Fix a warning
b10ccb83e Add rpclib to projects
0497875ff Stop the orgy of casts
37dc495b9 Simplify MSVC workaround
2ff4996d0 Fix ambiguous complier error C2666 in vs2017.The '+' opeator may cause ambiguity.Avoid implicit conversion.
77656c672 Fix sign-conversion warnings reported by Clang7
ea5e4790b Fix formatting
86681c4bb Update README.rst
e867768ee Do not override user provided compile flag
0c7f5c3ca Update README.rst
e7e2ab107 Make return type of basic_format_args::max_size() consistent.
29352af36 Update README.rst
68214bd90 More time formatting
bcf3fcd67 Clean up bit fiddling for argument packing
9dcf127fa Workaround a bogus MSVC warning
b8b06e3e1 Fix conversion warnings in Grisu
322b2594e Implement more time specifiers
0835f1ba3 Use full paths for fmt.pc.in
a084495d7 Add Ceph to projects
fa1d4dbcf Fix warnings
2b2cfdac1 Update docs
99744f8f8 Suppress unfixable warning
f5fe84923 Specialize formatter for chrono durations
a5a9805a9 First stub at the datetime format parser
645c76a9a Fix dummy warnings
fecb2d6f0 Eliminate msvc compiler warnings (#931)
64690d3a9 Add context_base::arg()
01640f44c Fully qualify dummy_int (#941)
e37d6a984 add make_printf_args and make_wprintf_args functions (#934)
982ee5c69 parse_context -> format_parse_context
b7b854855 thousands_sep -> thousands_sep_impl (#939)
00a8cc832 Fix formatting
33fbb3a7e Fix remaining linker errors.
bd6121596 Disable fmt-impl-test in windows + shared lib.
702b3d161 Fix link error in windows with shared library.
9d4ef9435 Install pdb files.
6c95fb356 Default Context to format_context
16b78ee62 fix incompatibilities with c++2a mode in clang
19e008876 More locale support
f2ee98810 Improve locale support
1385050e2 More formatter tests
03c1b110a Fix gcc 4.4 build
cc805c616 Test enabled formatters
e01579231 Disallow leading zeros in arg-id
34030deca Cleanup warning flags
6b26e3f2d Manifest & Gradle comment
d286c9775 Update for Gradle build
d951f6dfe Get latest Gradle (ver. 4.10.2)
a23d59247 Fix check_format_string (#925)
36161284e Update docs
38f355d87 Revert "find sphinx-build before calling build.py"
324eac1aa Make locales work with any character type
bdda4d603 Simplify compile-time strings
5ee1a4bc8 check for property 'mutable iterator' and SFINAE on it
2dea780fb change type naming and fix sfinae bug
b98e8301d add non-char support for compile-time format check
ccd3e8bbf Make is_constructible public (#918)
437315380 Update usage.rst
73cfd8f32 Fix colored print
ec384302d additional test for print with background color
0a96c032b Parameterize v*printf on string type (#920)
61e6d2e38 Fix core version of vformat_to
ea4010d70 Merge has_to_string_view into is_string
486fff597 Add sprintf_format instantiations and remove syntactic noise
1e3dcbba8 fix: 'format_to_n' compiles 'std::back_inserter' arguments
f0328f8e3 Use char_traits::length in string_view ctor (#914)
895fb9845 Disallow gcc 4.4 failures
20c708bf6 Fix build on gcc 4.4
9d0c9c4bb cmake: output share/fmt.pc
2d2326a76 Fix compilation with older gcc
1ec027230 Get rid of FMT_UNION
2c81c851b Adapt any string-like type to be used by {fmt} just like the standard string types already supported. The adaption is totally non-intrusive.
846c644e8 Workeround broken sprintf in MSVC
13d472bd8 Compute output size for grisu
b71d3fe7a Remove use_grisu
847abb6f8 Fix test
dda47c946 Merge min_digits and max_digits
292462215 Fix naming of basic_format_specs members
bda5f9a55 Replace grisu2_specs with core_format_specs
b1ca608ba Remove unused empty_spec
e8efdef8d Avoid extra copy
98f1c1fe8 Remove unused code
50b18a3c1 Integrate Grisu
699297520 Implement Grisu rounding
4bb76ef0c Remove redundant definition of print
ddd7caf38 Fix locale-dependent formatting (#905)
10e03e695 use found python executable for launching sphinx-build
07200f445 find sphinx-build before calling build.py
08a65c228 Workaround broken constexpr in MSVC2017
167f8fe32 Fix a typo in api.rst
57983423c Remove signbit workaround
7bebb3e12 Clarify overload resolution in docs
939fbe556 Remove basic_fixed_buffer.
61f81a071 minor documentation corrections
f27defc63 Parameterize printf functions on the type of the format string.
6a685571d Make 'std::*::basic_string_view' a valid argument type for 'format_str' parameters.
87a0408c6 Fix ostream.h build
2b5acad4a Remove redundant size argument to write_padded
655ce5338 is_format_string -> is_string
fea712abb Parameterize ostream functions on the type of the format string.
f16a118e8 Fix non-matching char types.
041bf83d9 Improve fmt::format readability
229903239 Document how to write a formatter for a type hierarchy
f5480635c visit -> visit_format_arg
cdf3fa08d Put related code together in fmt/core.h
38325248e Count width in code points (#628)
deb901b9e Parameterize core functions on the type of the format string.
0f98de301 Update docs
c797708fc Workaround strlen being non-constexpr in ARM toolchain
49b4c1e9d Update docs
63a87beba Add to_string_view
4e0c31465 checked_format_args -> checked_args
c3538a1ee Simplify variadic functions further
2d7d0835d Simplify variadic functions
3f4cfa6c6 Implement UTF-8 string support
f8027414f Impelement char8_t support
76a47d41c Cleanup the use of FMT_CHAR
267fdc7a1 Parameterize core functions on the type of the format string.
5bced1242 Parameterize more functions on string type
674999c52 fix vs2017 warning fmt::v5::localtime 'not all control paths return a value'.
e4fea22d1 Make char8_t a strongly-typed enum
66992e90d Clarify that writing to memory_buffer appends (#877)
e864acfdb Fix compilation with intel compilers (ICC/ICPC) v14.0
4cf21f58b constrain templated format_to on proper format string type.
d7f17613f Fix compilation on platforms with exotic double (#878)
e4ca37ccf Parameterize format_to on string type (#880)
d66fa2216 Reduce syntactic noise
48e6dcd0f Implement workarounds for gcc 4.4
0ea3221d3 Remove is_named_arg and add FMT_CHAR
73c53d783 Parameterize 'printf(rgb color, ...)' and 'vprint_rgb(rgb color, ...)' on the type of the format string.
d41be23ac Simplify string_view detection
2def9e4c8 Remove FMT_DTOR_NOEXCEPT
ff6e46ed9 More cleanup
715f2b4c0 Remove require_wchar and internalize no_formatter_error
ec0cdc46f Workaround Windows slowness

git-subtree-dir: externals/fmt
git-subtree-split: 9e554999ce02cf86fcdfe74fe740c4fe3f5a56d5
2020-04-22 21:00:18 +01:00
MerryMage
13e8b7b516 emit_x64_floating_point: F16C implementation of FPSingleToHalf 2020-04-22 20:58:17 +01:00
MerryMage
d32d6fe598 emit_x64_floating_point: F16C implementation of FPHalfToSingle and FPHalfToDouble 2020-04-22 20:58:12 +01:00
MerryMage
a53ba12be2 emit_x64_floating_point: Factor out ConvertRoundingModeToX64Immediate 2020-04-22 20:58:12 +01:00
MerryMage
5a2adc6629 backend/x64: Expose FPCR in EmitContext instead of its subcomponents 2020-04-22 20:58:12 +01:00
Merry
01bb1cdd88 Merge pull request #458 from lioncash/float-op
A64: Handle half-precision floating point in FABS, FNEG, and scalar FMOV
2020-04-22 20:58:12 +01:00
Lioncash
28a8b4d210 A64: Handle half-precision floating point in scalar FMOV
This is simply performing a scalar value transfer between registers
without conversions, so this is trivial to handle as-is.
2020-04-22 20:58:12 +01:00
Lioncash
d7ac5a664f A64: Handle half-precision floating point in FCVTL
Like FCVTN, now that we have half-precision floating point conversion
functions available, we can go ahead and use those to eliminate the
interpreter fallback.
2020-04-22 20:58:12 +01:00
Lioncash
fe84ecb780 A64: Handle half-precision floating point in scalar FABS
Now that we have the half-precision variant of the opcode added, we can
simply handle the instruction instead of treating it as undefined.
2020-04-22 20:58:12 +01:00
Lioncash
fac9224d5e A64: Handle half-precision floating point in FCVTN
Now that we have IR instructions for performing conversions with
half-precision floating point, we can also handle half-precision values
within FCVTN.
2020-04-22 20:58:12 +01:00
Lioncash
8309ec7a9f frontend/ir_emitter: Add half-precision variant of FPAbs 2020-04-22 20:58:12 +01:00
Lioncash
16de99d3e3 A64: Enable FCVT floating-point conversions for half-precision
With this, we no longer have to fall back to the interpreter in any of
the FCVT floating-point conversion instructions.
2020-04-22 20:58:12 +01:00
Lioncash
10abc77fad A64: Handle half-precision floating point in scalar FNEG
With the half-precision variant of the FPNeg opcode added, we can
utilize it here to emulate the half-precision variant of FNEG.
2020-04-22 20:58:12 +01:00
Lioncash
e4c259d69f frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes 2020-04-22 20:58:12 +01:00
Lioncash
c97efcb978 frontend/ir_emitter: Add half-precision variant of FPNeg 2020-04-22 20:58:12 +01:00
Lioncash
dff5da1063 common/fp/unpacked: Amend behavior of FPUnpackCV
This is supposed to call FPUnpackBase instead of FPUnpack. This would
result in alternate half-precision representations being misinterpreted
when it comes to dealing with NaNs.
2020-04-22 20:58:12 +01:00
Merry
f01afc5ae6 Merge pull request #456 from lioncash/mov
A64: Enable FMOV (general) for half-precision floating point
2020-04-22 20:58:12 +01:00
Lioncash
03bc2334fe common/fp/op/FPConvert: Amend off-by one in double NaN case in FPConvertNaN
Avoids potentially clobbering the intended sign bit value during
conversions to double-precision values. The other conversion types are
already properly handled, so those don't need to be addressed.
2020-04-22 20:58:12 +01:00
Lioncash
c57b146fb2 common/fp/op/FPConvert: Add half-precision instantiations to FPConvert 2020-04-22 20:58:12 +01:00
Merry
c1ce94872d Merge pull request #455 from lioncash/sqrdmulh-scalar
A64: Implement SQRDMULH and SQDMULL's scalar indexed variants
2020-04-22 20:58:11 +01:00
Lioncash
25a7256ee1 A64: Enable FMOV (general) for half-precision floating point
This just transfers values between vector registers and general-purpose
registers with no conversions performed, so this is trivial to add
support for half-precision to.
2020-04-22 20:58:11 +01:00
Lioncash
97dd3d0596 A64: Implement SQRDMULH's scalar indexed element variant 2020-04-22 20:58:11 +01:00
Lioncash
49b51e34f1 simd_vector_x_indexed_element: Deduplicate index and Vm operand construction 2020-04-22 20:58:11 +01:00
Lioncash
692aba91b6 A64: Implement SQDMULL{2}'s scalar indexed element variant 2020-04-22 20:58:11 +01:00
Lioncash
c043b831d5 A64: Implement SQDMULL{2}'s by-element variant 2020-04-22 20:58:11 +01:00
Lioncash
72af5a3dff simd_scalar_x_indexed_element: Factor out index and Vm argument construction
This will be useful in the implementations of SQRDMULH and SQDMULL{2} as
well.
2020-04-22 20:58:11 +01:00
Lioncash
224ff0afaa A64: Implement SQRDMULH's by-index vector variant 2020-04-22 20:58:11 +01:00
Lioncash
3a3542414b A64: Implement FRECPX's half-precision floating point variant 2020-04-22 20:58:11 +01:00
Lioncash
bd892ec4ef frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point 2020-04-22 20:58:11 +01:00
Lioncash
974fbf0677 frontend/ir/value: Add U16U32U64 type to represent floating point types 2020-04-22 20:58:11 +01:00
Lioncash
eb3e0d5908 common/fp/op/FPRecipExponent: Add half-precision floating point specialization 2020-04-22 20:58:11 +01:00
Lioncash
a829c93406 common/fp/unpacked: Correct edge-cases within FPUnpack for half-precision floating point
This corrects one case where floating-point exceptions could be set when
they're not supposed to be.

This also corrects a case where values were being treated as NaNs when
they weren't supposed to be.
2020-04-22 20:58:11 +01:00
Lioncash
7030b9af95 common/fp/process_nan: Add half-precision instantiations for NaN processing functions 2020-04-22 20:58:11 +01:00
Lioncash
14f55d7476 common/fp/unpacked: Add half-precision instantiation of FPRoundBase 2020-04-22 20:58:11 +01:00
Lioncash
7e814de445 common/fp/unpacked: Handle half-precision unpacking in FPUnpackBase 2020-04-22 20:58:11 +01:00
Lioncash
8f9fe8690a common/fp/unpacked: Adjust FPUnpack to operate like ARM pseudocode
This function is defined as always disabling the AHP bit in the fpcr
before performing any operations.

At the same time, rename the original FPUnpack function to FPUnpackBase
to match the pseudocode in the ARM reference manual.
2020-04-22 20:58:11 +01:00
Merry
37c4c39d62 Merge pull request #448 from lioncash/saturate
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
2020-04-22 20:58:11 +01:00
Merry
f5d774bdbd Merge pull request #449 from lioncash/hp
common/fp/info: Add specialization of FPInfo for half-precision floating point
2020-04-22 20:58:11 +01:00
Lioncash
126c29a9e9 A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
These can just be implemented in terms of the vector variants for the
time being.
2020-04-22 20:58:11 +01:00
Lioncash
0b67b94b6c common/fp/info: Add specialization of FPInfo for half-precision floating point
Puts the necessary info struct in place for further use.
2020-04-22 20:58:11 +01:00
Lioncash
dd7433f9d3 A64: Amend prototypes of some SIMD scalar shift by immediate opcodes
These take a vector for a destination.
2020-04-22 20:58:11 +01:00
Lioncash
99c494bae9 common/fp/unpacked: Add FPRoundCV
Corresponds to the equivalent pseudocode within the ARMv8 reference
manual. This will be necessary for supporting half-precision
floating-point.

This also makes use of it within FPConvert
2020-04-22 20:58:11 +01:00
Merry
bbd5330ad2 Merge pull request #447 from lioncash/flag
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Lioncash
490bebbd9a common/fp/unpacked: Add FPUnpackCV
Adds a template function that performs the same behavior as in the ARM
pseudocode, and utilizes it in FPConvert, which will be necessary for
half-float support.
2020-04-22 20:58:11 +01:00
Merry
fb039e232c Merge pull request #442 from lioncash/fcvtxn
A64: Implement scalar and vector variants of FCVTXN
2020-04-22 20:58:11 +01:00
Lioncash
6aed4036ef ir_opt/a64_get_set_elimination_pass: Add handling for NZCV raw get and set operations 2020-04-22 20:58:11 +01:00
Merry
4f937c1ee1 Merge pull request #446 from lioncash/sqshl
A64: Implement scalar variants of SQSHL (register) and UQSHL (register)
2020-04-22 20:58:11 +01:00
Lioncash
aa22db534b A64: Implement AXFlag and XAFlag 2020-04-22 20:58:11 +01:00
Merry
d74cccbc84 Merge pull request #445 from lioncash/sqrt
A64: Implement single and double-precision vector variant of FSQRT
2020-04-22 20:58:11 +01:00
Lioncash
20ffe568d0 A64: Implement RMIF 2020-04-22 20:58:11 +01:00
Merry
6d7e7c3269 Merge pull request #443 from lioncash/flag
A64: Rearrange flag format/manipulation instructions
2020-04-22 20:58:11 +01:00
Lioncash
51b526e453 A64: Implement CFINV 2020-04-22 20:58:11 +01:00
Merry
5d01f1b462 Merge pull request #441 from lioncash/constexpr
common/bit_util: Mark a few functions as constexpr
2020-04-22 20:58:11 +01:00
Lioncash
597a8be5d5 ir: Add A64-specific opcodes for getting and setting raw NZCV values
This will be necessary to implement the flag manipulation and flag
format instructions.
2020-04-22 20:58:11 +01:00
Merry
743c52fdc5 Merge pull request #440 from lioncash/include
common/fp: Remove unnecessary includes
2020-04-22 20:58:11 +01:00
Lioncash
d3515279df A64: Implement the vector version of FCVTXN 2020-04-22 20:58:10 +01:00
Lioncash
17aea0b997 A64: Implement UQSHL (register)'s scalar variant
This can be implemented in terms of the vector variant.
2020-04-22 20:58:10 +01:00
Lioncash
c99d4b762e A64: Implement single and double-precision vector variant of FSQRT 2020-04-22 20:58:10 +01:00
Lioncash
54e0b487f3 A64: Rearrange flag format/manipulation instructions
Gives these instructions better categorical labeling.
2020-04-22 20:58:10 +01:00
Lioncash
88d1977cb9 common/bit_util: Make a few functions as constexpr
These four functions can be made constexpr with no issue.
2020-04-22 20:58:10 +01:00
Lioncash
f33e5939b7 common/fp: Remove unnecessary includes 2020-04-22 20:58:10 +01:00
Lioncash
302f56b36a A64: Fall back to interpreting for FCADD and FCMLA half-precision variants
Rather than straight-up treating them as undefined, we can fall back to an
interpreter in this case.
2020-04-22 20:58:10 +01:00
Lioncash
4339a8fff6 A64: Implement the scalar version of FCVTXN 2020-04-22 20:58:10 +01:00
Lioncash
35ddf68ad5 A64: Implement SQSHL (register)'s scalar variant
We can implement this in terms of the vector variant.
2020-04-22 20:58:10 +01:00
Lioncash
5cf1478620 frontend/ir: Add opcodes for vector square roots 2020-04-22 20:58:10 +01:00
Lioncash
36027ebef5 frontend/ir/microinstruction: Add missing cases for FPRecipExponent{32,64} for ReadsFromAndWritesToFPSRCumulativeExceptionBits()
This was intended to be added within #437, but was missed
2020-04-22 20:58:10 +01:00
Merry
40b081438a Merge pull request #439 from lioncash/fcmla
A64: Implement FCADD and FCMLA
2020-04-22 20:58:10 +01:00
Lioncash
7c81a58ed3 frontend/ir/ir_emitter: Alter parameters of FPDoubleToSingle() and FPSingleToDouble() to pass along desired rounding mode
This will be necessary to special-case the non-IEEE Von Neumann rounding
to odd rounding mode.
2020-04-22 20:58:10 +01:00
Merry
d91192681a Merge pull request #438 from lioncash/fmulx
A64: Implement scalar double/single precision FMULX (by element)
2020-04-22 20:58:10 +01:00
Lioncash
ed29ef8cca A64: Implement FCMLA 2020-04-22 20:58:10 +01:00
Lioncash
95af9dafbe common/fp/op: Add FP conversion functions 2020-04-22 20:58:10 +01:00
Merry
9f11720a69 Merge pull request #437 from lioncash/frecpx
A64: Implement FRECPX (single, double precision)
2020-04-22 20:58:10 +01:00
Lioncash
bdcea0b0dc A64: Implement scalar double/single precision FMULX (by element) 2020-04-22 20:58:10 +01:00
Lioncash
5ce17574f9 A64: Implement FCADD 2020-04-22 20:58:10 +01:00
Merry
34d917f34e Merge pull request #436 from lioncash/no-alloc
A64: Implement LDNP/STNP
2020-04-22 20:58:10 +01:00
Lioncash
e44730ba6d A64: Implement FRECPX (single, double precision) 2020-04-22 20:58:10 +01:00
Lioncash
bfaeb08d3c A64: Implement LDNP/STNP
LDNP and STNP indicate that a memory access is non-temporal/streaming
(i.e. unlikely to be repeated), allowing data caching to not be
performed. However, given this is only a hint, we can treat these two
instructions as regular LDP and STP instructions for the time being.
2020-04-22 20:58:10 +01:00
Lioncash
9cf3c25811 frontend/ir/ir_emitter: Add opcodes for floating point reciprocal exponents 2020-04-22 20:58:10 +01:00
Merry
dbf47db713 Merge pull request #434 from lioncash/format
A32/translate_arm: Formatting/tidying up
2020-04-22 20:58:10 +01:00
Lioncash
b168c2a9f9 common/fp/op: Add operations for floating-point reciprocal exponents 2020-04-22 20:58:10 +01:00
Lioncash
05a6ab691d translate_arm/coprocessor: Minor tidying up 2020-04-22 20:58:10 +01:00
Lioncash
1e32a09c03 translate_arm/vfp2: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
e209b31073 translate_arm/synchronization: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
9514e3602e translate_arm/status_register_access: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
c6aa1a708a translate_arm/saturated: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
a72813599a translate_arm/reversal: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
7be56e6b67 translate_arm/parallel: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
3c00a616d6 translate_arm/packing: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
c711188f46 translate_arm/multiply: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
c8dad40d81 translate_arm/misc: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
a7bf5ff77d translate_arm/load_store: Invert conditionals where applicable 2020-04-22 20:58:10 +01:00
Lioncash
2e180a7f14 backend/x64/a32_interface: Mark Context move constructor and move assignment as noexcept
Provides a more "correct" move constructor/assignment operator, since
these relevant functions shouldn't throw exceptions.

Has the benefit of playing nicely with std::move_if_noexcept and other
noexcept library facilities.
2020-04-22 20:58:09 +01:00
Lioncash
f4b19a7393 translate_arm/extension: Invert conditionals where applicable 2020-04-22 20:58:09 +01:00
Lioncash
deb9dd4acc block_of_code: Replace cast with [[maybe_unused]] in DoesCpuSupport() 2020-04-22 20:58:09 +01:00
Lioncash
c2de6ecfd0 translate_arm/exception_generating: Invert conditionals where applicable 2020-04-22 20:58:09 +01:00
Lioncash
d8a8d3b073 translate_arm/data_processing: Invert conditionals where applicable 2020-04-22 20:58:09 +01:00
Lioncash
df5c51ff47 translate_arm/branch: Invert conditionals where applicable
Allows unindenting code a bit.
2020-04-22 20:58:09 +01:00
Lioncash
3290a9fdc2 common: Remove address_range.h
The AddressRange structure isn't used anywhere within the codebase, so
this can be removed. Particularly because there's no real appeal/heavy
potential use of it in the future that isn't trivial to add back if
needed.
2020-04-22 20:57:38 +01:00
Lioncash
ee973f13c7 frontend/A32/ir_emitter: Mark PC() and AlignPC() as const-qualified member functions
These don't modify instance state, so they can be const-qualified member
functions.
2020-04-22 20:57:38 +01:00
Lioncash
3a2dd09122 frontend/A64/ir_emitter: Mark PC() and AlignPC() as const qualified member functions
These don't actually alter any instance state.
2020-04-22 20:57:38 +01:00
Lioncash
575ae852a9 constant_propagation_pass: Fold byte reversal opcodes where applicable
These are reasonably trivial to fold away when applicable. We just
perform the swap and replace the instruction with the constant value.
2020-04-22 20:57:37 +01:00
Merry
2c53f354ab Merge pull request #418 from lioncash/fold-op
constant_propagation_pass: Handle folding for Least/MostSignificant{Bit, Byte, Half, Word} opcodes
2020-04-22 20:57:37 +01:00
Merry
ad14a33672 Merge pull request #417 from lioncash/swap
common: Move byte swapping functions to bit_utils.h
2020-04-22 20:57:37 +01:00
Lioncash
d302d9bd0c constant_propagation_pass: Handle folding for Least/MostSignificant{Bit, Byte, Half, Word} opcodes
These are quite trivial to fold.
2020-04-22 20:57:37 +01:00
Lioncash
7139942976 common: Move byte swapping functions to bit_utils.h
These are quite general functions, so they can just be moved into common
instead of recreating a namespace here.
2020-04-22 20:57:37 +01:00
MerryMage
7c8fcaef26 emit_x64_vector_floating_point: AVX && DN implementation of EmitFPVectorMulX 2020-04-22 20:57:37 +01:00
MerryMage
e3898e628e A64: Implement FMULX (by element), single and double precision variants 2020-04-22 20:57:37 +01:00
Lioncash
93351c7efb a64_emit_x64: Make constness of loop elements explicit within GenFastmemFallbacks() 2020-04-22 20:57:37 +01:00
MerryMage
c106d8cedf A64: Implement FMULX, vector single-precision and double-precision variant 2020-04-22 20:57:37 +01:00
Lioncash
7752ffc50c a64_emit_x64: Convert std::vector instances in GenFastmemFallbacks() to std::array
Given these are quite small, we can avoid the need to heap allocate
here.
2020-04-22 20:57:37 +01:00
MerryMage
fa8925c4df IR: Implement FPVectorMulX 2020-04-22 20:57:37 +01:00
Michał Janiszewski
bbd8abaa25 Provide justification for always-true condition (#412) 2020-04-22 20:57:37 +01:00
Michał Janiszewski
7d0e918b51 Add missing include guards 2020-04-22 20:57:37 +01:00
V.Kalyuzhny
764a93bf5a Switch boost::optional to std::optional 2020-04-22 20:57:37 +01:00
Lioncash
07c197e8d0 constant_propagation_pass: Add 64-bit variants of shifts to the pass
These optimizations can also apply to the 64-bit variants of the shift
opcodes; we just need to check if the instruction has an associated
pseudo-op before performing the 32-bit variant's specifics.

While we're at it, we can also relocate the code to its own function
like the rest of the cases to keep organization consistent.
2020-04-22 20:57:37 +01:00
Lioncash
8248999c5d constant_propagation_pass: Fold division operations where applicable
We can fold division operations if:

1. The divisor is zero, then we can replace the result with zero (as this is how
ARM platforms expect it).
2. Both values are known, in which case we can just do the operation and
store the result
3. The divisor is 1, in which case just return the other operand.
2020-04-22 20:57:37 +01:00
Merry
d83eae2004 Merge pull request #406 from lioncash/mul
constant_propagation_pass: Fold Mul32 and Mul64 cases where applicable
2020-04-22 20:57:37 +01:00
Merry
73d9393300 Merge pull request #405 from lioncash/inst
a64: Add ARMv8.4+ instructions encodings to the encoding table
2020-04-22 20:57:37 +01:00
Lioncash
7ad6981437 constant_propagation_pass: deduplicate common 32/64 bit checking for results in folding functions
It's common for an folding operation to apply to both the 32-bit and
64-bit variant of the same opcode, which leads to checking which kind of
result we need to store the value as. This moves it to its own function,
so that we don't need to duplicate it in various functions.
2020-04-22 20:57:37 +01:00
Lioncash
f1a66c37ba a64: Add ARMv8.4+ instructions encodings to the encoding table
Keeps the table up to date with the ARM specification.
2020-04-22 20:57:37 +01:00
Lioncash
72daf37208 constant_propagation_pass: Fold Mul32 and Mul64 cases where applicable
Multiplication operations can currently be folded if:

1. Both arguments are known constant values
2. Either operand is zero (in which case the result is also zero)
3. Either operand is one (in which case the result is the non-one
operand).
2020-04-22 20:57:37 +01:00
Lioncash
43b2eb4688 constant_propagation_pass: Fold SignExtend{Type}ToLong opcodes if possible 2020-04-22 20:57:37 +01:00
Lioncash
2da2cf9058 constant_propagation_pass: Fold SignExtend{Type}ToWord opcodes if possible 2020-04-22 20:57:37 +01:00
Lioncash
0583d401e3 ir/value: Add IsSignedImmediate() and IsUnsignedImmediate() functions to Value's interface
This allows testing against arbitrary values while also simultaneously
eliminating the need to check IsImmediate() all the time in expressions.
2020-04-22 20:57:37 +01:00
Lioncash
c42f6ea184 constant_propagation_pass: Fold ZeroExtend{Type}ToLong opcodes if possible
These are equivalent to the ZeroExtendXToWord variants, so we can
trivially do this as well.
2020-04-22 20:57:37 +01:00
Lioncash
e3258e8525 ir/value: Add a GetImmediateAsS64() function
Provides a signed analogue to GetImmediateAsU64() for consistency with
both integral classes when it comes to signed/unsigned..
2020-04-22 20:57:37 +01:00
Lioncash
2274214ff0 constant_propagation_pass: Combine zero-extension folding code into its own function
Separates the behavior from the actual switch statement and gets rid of
duplication, now that we can use the general GetImmediateAsU64()
function.
2020-04-22 20:57:37 +01:00
Lioncash
4a3c064b15 ir/value: Add an IsZero() member function to Value's interface
By far, one of the most common things to check for is whether or not a
value is zero, as it typically allows folding away unnecesary
operations (other close contenders that can help with eliding operations  are 1 and -1).

So instead of requiring a check for an immediate and then actually
retrieving the integral value and checking it, we can wrap it within a
function to make it more convenient.
2020-04-22 20:57:37 +01:00
MerryMage
5f7df9a182 Squashed 'externals/fmt/' changes from 135ab5cf..3e75ad98
3e75ad98 Update version
4f043f8e Bump version
cc02cbc4 Fix formatting
73c0238e Update changelog
cb122a4d Fix format_to formatting to wmemory_buffer
dc69cc45 Clean tests
9d8021f0 Add checks for NVIDIA's CUDA compiler
9d2221b9 Improve error message when formatting unknown types
70a6a4bb prevent ""fmt/range.h"" from specializing fmt::basic_string_view (#865)
e4fc856c Disable android build due to gradle issues
3f4984fb Clean core-test and fix linkage errors on older gcc
d4366505 Workaround visit lookup issues in printf.h on gcc
894b6fac Changed to use scoped enum
59f555ad Workaround more visit lookup issues on gcc
a7e356cc Update README.rst
e758bfba Merge branch 'release' of github.com:fmtlib/fmt
66381e30 Minor cleanup
295a0d84 Update version
1fb1c4c9 Update docs
465a5935 Add table support to rst2md
d62f4c3b Formatting
a243490a Add more methods to benchmark results
9e12ca60 Update changelog
fbca830d Update changelog, readme and improve compat
6146248c Update changelog
bc26fbf1 Move experimental color API to fmt/color.h
97cc8893 Workaround a visit lookup issue in gcc 8 (#851)
7110b460 Optimize default formatting
c8a8464f Optimize buffer construction
8cbfb6e7 Get rid of conversion warning in gcc-4.8 (#854)
6ffc828a Phasing out null_terminating_iterator
aeb6add3 Skip strchr for the common case
5614289d Optimize and simplify format string parsing
10c7f893 Optimize format string processing on dumb compilers
59c268a5 Use strlen when possible since it's constexpr on gcc
918bb1ce Optimize argument capture
a3ba6b4f Disable the fmt(...) macro by default (#853)
86716894 Update docs and formatting
cc10b460 Make format_to faster on older gcc
981797f0 Get rid of implicit-fallthrough warn. in GCC 7 and 8
21177757 Micro-optimize parsing
be0e2684 Optimize processing of trailing '}'
fbc38b90 Pass heavy arguments by ref
8dc69b9d Workaround a bug in Intellisense
1489d3b7 Implement exponential notation
dd8c5ce4 Implement more FP formatting options
46484da7 Fix a warning
802ff886 Fix compilation of time.h when localtime_t is a macro (#843)
95a71899 Remove conversion compiler warnings (#844)
e483a01a Implement some formatting options in Grisu
f5108091 Revert "Implement some formatting options in Grisu"
2a952dd0 Implement some formatting options in Grisu
0de44a46 Implement exponent formatting
f0d0a1eb Implement Grisu2 digit generation
569ac91e Implement Grisu boundary computation
a11eb3a0 Workaround various icc bugs (#822)
62010520 Disable gnu-string-literal-operator-template warning
98751476 Make convert_to_int public (#818)
ba95e36a Clarify that '\0' cannot be used as fill (#832)
abde38b4 Add compilation support with Newlib nano for embedded targets
18400503 Fix C4127 warning in basic_writer<Range>::write_double
9de31211 Reformat and add a comment
8bbb0b48 Update README.rst
5c0101ab Use the correct function signature in the docs
fbe6410e Fix docs
8b9fb9fb Fix ambiguous instantiation with formatter in fmt/ostream.h (#830)
0f04ec68 Fix package upload (#828)
80907385 Update changelog
5d02041c Update changelog
4b868b89 Re-enable compile-time format-string checking
4061a0d3 Parameterize vformat to support custom char types
c68bab70 Remove broken fmt::internal::format_enum (#818)
0c63d15e Improve wording
ce19309d Workaround a bug in icc 15
c6843491 Move contiguous version of format_to to fmt/core.h
8db14efa util-test -> core-test and minor cleanup
ffe414ca Add compile-time format string checks to format_to (#783)
c178ab44 Remove FMT_USE_RVALUE_REFERENCES
5befe658 Remove fmt/folly.h and clean up core API
35538ca6 Merge more format overloads
4f164097 Merge format overloads using SFINAE
2a4e9488 Add UTF-8 types
d778bded Make line in tests fit within 80chars
7b4f170c Fix warning about using old-style cast
b1d10a28 Add support for dynamic arg sets
cf2719bd Add support for types explicitly convertible to wstring_view
50584f42 Test formatting of an object with templated conversion to string-like
73bed45b Add support for types explicitly convertible to fmt::string_view
6eaa5074 Fix global initialization issue (#807)
48dff9f3 Update docs
a9e26159 Minor cleanup
efd8ee8a Reduce warnings, support #809
8615ff2a Micro-optimize argument retrieval
916ed99d Micro-optimize argument retrieval
e7e9578e Optimize format string parsing
c99a2597 Mark new functions with FMT_API (#808)
e0f6a2f8 Add a formatter for folly::StringPiece
ae4a3945 Revert "Better support for newer CMake's"
a317448b Keep noexcept specifier when exceptions are disabled.
0eb01b83 Better support for newer CMake's
2a4cd6d0 Fix the returned value of `format_to_n` with user-defined types having operator<<.
9c32e73a Fixing return unreachable warning on NVCC
e5c93108 Added clear() to basic_buffer
60c662b3 Add an example of reusing formatters
f66ba650 Optimize format string parsing
f21268aa Revert "Optimize format string parsing" because of a bug in MSVC
07b690a6 Update README.rst
f9e9bf02 Optimize format string parsing
c2ce7e4f Update version
434eb916 Update README.rst
09d94162 Update changelog
e6362642 Fix pedantic conversion warning
f0110e81 Update changelog and CI
479ee2a8 Fix MSVC build, take 2
e928b672 Fix MSVC 2013 build
ec218a3a Fix redefinition warning for RESET_COLOR
c04fb91b Fix handling of user-defined types in format_to (#793)
323b92bf Force linking of inline functions into the library (#795)
c6d9730d Fix sign conversion warnings (#790)
2e95823e Move new color support to format.h and mark old as deprecated
ab2d88ca Make format_to work with basic_memory_buffer (#776)
3abd036c Fix compilation on gcc 4
c2f38054 Add vformat_to_n (#769)
ce500635 Renamed enum color to colors. Added enum colors conversion to rgb struct. Added colors_test.cpp.
0508bbc7 Add wchar_t overload of format_to_n (#764)
c2fbadb9 Fixed issue #779
47268ecd Fixed GCC version test
9ff3b6af Fix handling of compile-time strings when including ostream.h (#768)
e3707ef1 Document that file should be in wide-oriented mode for wide print
45fa4ee9 Merge branch 'master' of github.com:fmtlib/fmt
9c07b37f Using enum class now. Renamed from hex to color. Changed colr names to snake case.
5b5886a9 Fixed line length.
d2bfee13 Added quotes for strings in ranges and tuple likes.
aff6e45e Added support for rgb color output.
1b8a7f8f Fix postincrement in truncating and counting iterators
4bc26f0a Merge branch 'master' of github.com:fmtlib/fmt
fc6e0fe9 Fix FP formatting to a non-back_insert_iterator with sign & numeric alignment (#756)
cd5b5670 Make is_range and is_tuple_like public API, fix #751
6322b47e Minor cleanup
691a7a91 Add more compilers to CI and increase FMT_PEDANTIC warning levels (#736)
dd1a5ef7 Let requests close the file
d5c46259 Fix formatting of more than 15 named arguments (#754)
47d147b6 Simplify the nvcc warning fix
911a7511 Fix nvcc warnings (#752)
94b47628 Fix docs
252f11f8 Fix a bogus MSVC warning about unreachable code, take 2
81d56638 Fix more bogus MSVC warnings about unreachable code (#748)
68f0ac82 Fix a bogus MSVC warning about unreachable code
b60a5c5d Improve floating-point formatting
8dc2360b Fix a comment
4e4b8570 Implement simple version of Grisu
40275579 Fix tests on 64-bit MSVC
5c32aa41 Workaround a bug in MSVC
468c243c Add a function to get cached power of 10
2f257b72 Implement normalization and simplify power table
6a5bb6e2 Move Android.mk to support and update
e282d963 Bump version
e2cd521b Fix incorrect call to on_align in '{:}=' (#750)
fba352a9 Don't use UDL templates on Intel C++ compiler (#742)
6dcc526d Update release script
5386f1df Update version
ba6640b2 Fix formatting
507a50c3 Fix changelog
147807c9 Detect integer_sequence support on MSVC
8b246531 Update changelog
5ad54256 Fix a conflict between fmt::join and fmt/ostream.h (#744)
6ebc1a96 Merge locale.h into format-inl.h
6966db1d Update docs
2196025d Fix a warning
589f5f37 Update changelog
edd5f144 Fix compilation errors on gcc 4.4
936aba5f Fix compilation errors on gcc 4.4
3e3a2774 Update changelog
b76bb796 Improve naming consistency
fbd51534 Update changelog
69823bf8 Improve naming consistency
d940fa67 Disable unsafe implicit conversion to std::string (#729)
d2bf93fe Update changelog
550ef1d2 MSVC improvements and data truncation cleanup.
728e4f5a Fix docs
8c255771 Update docs and changelog
a68fd44e Add ranges.h to FMT_HEADERS in CMakeLists.txt (#738)
e3f7f3a2 Add support for ranges, containers and tuple-like types in fmt/ranges.h
984232db Remove duplicate ChangeLog entries
78677e3f Update ChangeLog and docs
ad23270e Document to_wstring
3c0f8c26 Update ChangeLog
98937893 Detect inline namespaces on gcc
dfb65469 Fix docs
3aa29115 Update ChangeLog.rst
d3f6c841 Update ChangeLog.rst
c1441ae4 Update ChangeLog.rst
dece85b3 Fix docs, take 2
6a1df3bd Fix docs
838400d2 Add inline namespace fmt::v5
b64b24eb Update ChangeLog.rst
fc908711 Update ChangeLog.rst
46c374a8 Fix compilation with new gcc and -std=c++11 (#734)
f0ae7257 Clarify the use of allocators
d72d0462 Update paths in fmt.pro
edbbf7ce Fix FreeBSD 12
a4e4f745 Fix a -Wundef when FMT_GCC_VERSION < 600
7d3de497 Implement double to fp conversion
a4c7d99f Add bit_cast
0adccaef Fix a -Wundef of _LIBCPP_VERSION
2570f1af Provide more overloads for the wide string flavour
ca31ca13 Fixed arg_formatter_base::write_pointer to not mutate the format specs.
6cd66610 remove trailing spaces.
fe19c266 Move format_string to fmt namespace for ADL
2768af23 Add cached powers of 10
dd296e1d Add a script to compute powers of 10
0efc8a18 Fix compiler warning about narrowing
df1ba52b Update example
221b08fd Merge branch 'master' of github.com:fmtlib/fmt
fa9066fe context_base::begin -> out
90ff31b3 Fix a -Wundef warning on clang
b1f68c43 Merge branch 'master' of github.com:fmtlib/fmt
cd90097c Implement handmade FP
822eccc3 Sync API with standards proposal
2ae41242 allow time formatting with wchar_t contexts
a1579b0f Update key
ded921f0 Fix documentation build, take 2
3284751f Fix documentation build
bb738c4c Remove section on Write API since it's being superceeded by compile-time Format API
d180c25c Update godbolt link
1ed842a3 Update godbolt link
e80aba1c Remove format_float stub
7b8cb313 Make context_base::args() public
48ae0506 fixes MSVC compiler warning bloat (Visual Studio 2017, latest updates)
096c4051 Simplify char_traits
7610c536 Remove unused macro
111fa581 Update README.rst
52fcef1e Update docs
7d28674d make_args -> make_format_args
9382b76f context_t -> format_context_t
fd0b07a7 (w)context -> (w)format_context
26aa34f3 basic_context -> basic_format_context
44cc0346 Relax string_view requirements
0829cab8 Remove from_checked
cb7bbc62 Improve checked iterator support
5079f924 Fix a narrowing warning
5859e58b Fix msvc warnings
1e747f60 Fix msvc warnings
9d4efd7a Iterator Wars VI: Return of the checked iterator
9764f558 Update docs
4ef97b9b Add a missing comma
23759b26 basic_arg -> basic_format_arg, arg_store -> format_arg_store
4975297e Simplify counting iterators
e8e006f4 Fix compile checks for mixing narrow and wide strings (#690)
c5ebecf7 Document format_to_n
3cf05263 Return output iterator to the end from format_to_n
174087bf Implement format_to_n
050f3f1f Remove parts of obsolete write API
e90b1da3 Fix linker errors using fmt as shared library in MSVC
8e10d404 Fix compile tests
7a41d61d Add make_printf_args
4fea018b Fix string_view detection
6957d28c Detect string_view on libc++ (#686)
0ea70def Update readme
9ce5e30c Update readme
8c29459e Fix handling of empty string_view (#689)
a24005d5 Fix a narrowing warning
3651b7fc Fix a narrowing warning
b64486da Add format.cc
3da71d51 Move source files to the src directory
7971ed3d Update readme
f61ca2ec Update readme
84e520b7 Update readme
e8aa0f33 Update docs
17258e9c Update docs
6d339e32 Improve comment
c3d05245 Fix a shadowing warning
b58c8dde Update docs
505b3ae6 Workaround GCC bug 67371 (#682)
70dffc63 Remove unnecessary check
df828f88 Don't define FMT_GCC_VERSION on clang
42f70c8b Avoid narrowing casts
10b939b0 Remove unneeded usage of anonymous struct on clang
3adfaae2 Remove extra semicolon in format_args constructor
40066785 Fix warnings under MSVC (#679)
9c5f54a7 Add format example for padded hex byte
7bab90e5 Remove extra comma
2e21e7d1 Fix util-test
acb469ae Fixed UTF8/16 converters to support empty string input
c37c4c43 Fix find-package-test
6d21fc43 add alias targets with fmt namespace
e02aacc6 Add CMake namespace (#511)
aee4512c Gradle (#649)
7db0e94b Fix handling of numeric alignment with no width (#675)
9facc119 Update docs
a1d18711 Merge branch 'master' of github.com:fmtlib/fmt
daf650c4 Disallow formatting of multibyte strings into a wide buffer (#606)
8fd7e30f Update README.rst
ca93be13 Use fmt(s) as an alias for FMT_STRING(s)
80e57c7a Update to new naming conventions
ae3cc844 Check format string at compile time in print
585512fc Remove unnecessary instantiations
7755cdc1 Make symbols readable
f867d082 Update docs
a103b9bc Workaround missed optimization in gcc (#668)
bb47109a Cleanup
f1ede638 Make inline_buffer_size public and update docs
995b63ad Update copyright
40232917 Update docs
86a9bc82 Cleanup
b7632e96 Make format_to return iterator and update docs
5281ea6a do_vformat_to -> vformat_to and update docs
d07ba498 Fix docs
418659ad Fix compilation errors on gcc 4.4
1d2adef2 Fix compilation errors on gcc 4.4
45518c3f Fix compilation errors on gcc 4.4
698d9097 Workaround a bug in gcc 5.1
81074c70 Fix more compilation errors on gcc 4.6
1b452538 Fix more compilation errors on gcc 4.6
6090e51b Fix compilation errors on gcc 4.6
0827ec5a Fix compilation errors on gcc 4.6
4d35f941 Always use fallback string_view to pass format string (#664)
34cf54c2 Update README.rst
0565d654 Fix gcc 7.2 issue
f5dc0ed3 Break long lines
ea06f021 test: comment out one FormatStringErrors constexpr test
5b491773 test: Initialize some local variables
f45f70af Use trailing return type instead of deduction
db86e8d5 Remove a couple of unused argument names
55f5c9f2 Use FMT_NULL instead of 0 is a few more places.
e92ba107 Fix Python str.format link to point to Python 3 docs
a7ae5666 Enable join on msvc
24d249b0 Fix formatting of objects convertible to string_view
e508e308 Don't define FMT_LOCALE on OpenBSD
0ee4273b Put is_enum check first not to instantiate convert_to_int unnecessarily
8ca3ab2c Revert problematic pragma
18ac9870 Fix formatting of objects convertible to std::string
ce4a65ff Add pointer support to basic_writer
91721caa Add detection of wostream operator<< (#650)
1efc15c1 Fix MSVC build
8ed264fc Rename type enum constants to prevent collision with poorly written C libs (#644)
4ba3f7db Update docs
7d2723d5 posix.cc: Fix compilation with -fno-exceptions
24d66c5d compilation fix & warnings
229887bd Make constexpr remove_prefix gcc version check tighter (#648)
f3f19e76 Update docs
e9fa42ac Fix docs and build issues on gcc-4.6
affb35cf Replace using with typedef for compatibility with gcc-4.6
9710c058 Update documentation building script
1a4e8927 Move output_range to format.h
522de7b5 Replace using with typedef for compatibility with gcc-4.6
0b508fd2 Fix c++0x detection
1849735f Fallback to c++11 if c++14 not available
3239c518 Get rid of generic lambdas
78166ccd Get rid of generic lambdas
d8ef8a9e Cleanup
82222218 Update README.rst
b0005324 Merge the std branch
a502decd Added a fmt.pro to support build using qmake (#641)
61065e1a Fix unreachable code warning when signbit returns bool
403ae0a2 Add debug postfix for libfmt (#636)
5096c0fe Fix string_view detection
5b3f9eab Update syntax.rst
e802cf14 Add note about errno to the documentation
c96d6465 CMakeLists: Use GNUInstallDirs to set install location
dbd84697 Update usage.rst
5013c157 Silence MSVC 2017 constant if expression warning
cdfcee27 Use allocator_traits if available
66b25ef0 Add examples
6cb68f94 Fix warnings
0b635c9d Fix handling of fixed enums in clang (#580)
66afd9b3 Fix compilation on gcc 6
67e070fe Make format work with C++17 std::string_view (#571)
867b3309 Remove ANDROID macro check per comment in #458
64599973 Enable stream exceptions (#581)
35f8f036 Use less version 2.6.1 and sudo to fix npm install issues on travis
92a250fd Suppress Clang's warning on zero as a null pointer
2f13d41e Add to_wstring
1e19ae83 Workaround a bug in MSVC
3810d7e4 Workaround a bug in MSVC
5c7474e1 Relax constexpr requirements
1f57243b Relax constexpr requirements
dc540361 Conditionally compile constexpr
5d8ba816 Fix a segfault in test on glibc 2.26 #551
a9f810c1 Update README.rst
2582f41e Fix ifdefs
1a7d0ba2 Adding OpenSpace to the list of projects
8921f613 Update build script
f62e225e Automatically update version in release script (#431)
94806747 remove 'FMT_CPPFORMAT' CMake option
bfce29ff Improve conversion
8cf30aa2 Fix segfault on complex pointer formatting (#642)
f164e4c7 Remove old bcc-related comments
c57029c1 Add Drake & Lyft Envoy to the list of projects
8fa9acb8 Workaround broken __builtin_clz in clang with MS codegen (#519)
3dae2582 Describe cmake use of header-only target
1c7b751d Fix handling of implicit conversion to integral types larger than int
08dff377 Allow compiling and using as DLL in windows #502
c753a2af Don't include the world with WIN32_LEAN_AND_MEAN (#503)
a5185ec8 add SOURCELINK_SUFFIX for compatibility with Sphinx 1.5
768061c8 Fix FormatBuf implementation (#491)
0c136381 Move back_insert_range to format.h
5060568f %.f should have zero precision, not default precision
a09f7488 Add Kodi (xbmc) to the list of projects using fmt
f9fa7c40 Add FMT_API and FMT_OVERRIDE where needed
a980d3b4 Add fmt::join to format ranges (#466)
87eab90e Fix missing intrinsic when included from C++/CLI (#457)
75005bbc Don't export the -std=c++11 flag from the fmt target
19f990a9 Use https to fetch dependencies from github
bca9de9e Return iterator from format_to
0555cea5 Added a fmt.pro to support build using qmake (#641)
a93270fd Replace a bunch of craft with type_traits, take 2
21429c86 Revert "Replace a bunch of craft with type_traits"
0473c48f Add std::basic_string allocator support (#441)
72d9fffd Fix test compilation for FreeBSD (#433)
e79588d6 Replace a bunch of craft with type_traits
3a6c7d0c Fix signbit detection (#423)
5e4c34b2 Add version macro FMT_VERSION (#411)
bd8a7e7e More iteratification
f78c3e41 Fix unreachable code warning when signbit returns bool
0a402056 Add CONTRIBUTING.rst
e35d41ff Add extern templates for format_float (#413)
d8c25a17 Use nullptr if available
e95e4659 Add syntax.rst to build
e5111950 argument index -> argument id
229ee34e Fix compiler warnings
7fe0f3da Update ChangeLog
38b603a4 Update README.rst
a1e7e4a7 Fix compilation with -fno-exceptions (#402, #405)
3f24a388 Thread-safe time formatting (#396)
f853d94a Remove unnecessary fmt/ prefix (#397)
9649919d Document use of format_arg for user-defined type #393
c8efe145 Add api.rst to build
da80005f Fix compilation on Cygwin (#388)
8ed16353 Fix a typo
1760c31b Workaround Doxygen mess
72606f23 Add missing types to counting_iterator
c1571003 Add debug postfix for libfmt (#636)
6822466a Handle nested braces in join (#638)
64b349ae More iterator support & fmt::count
e3b69efb Suppress msvc warnings in gmock
322736d3 Add support for arbitrary output iterators
10291194 Cleanup
c1d137ed Add support for nonconiguous iterators
f6fd38bb More iterator support
c2fecb9b Clean API
9a53a706 Add support for back_insert_iterator
91ee9c9a Return iterator from the format method
67928eae Don't inherit context from parse_context
217e7c76 Pass ranges by value
22994c62 Decouple arg_formatter_base from buffer
00f1450d Update tesmplate parameter names
3a2e89e1 Reduce dependency on buffer
c719d944 Fix experimental/string_view detection
cea3c207 Give a better error message for function pointers (#633)
232ceabb Workaround an internal compiler error in MSVC
c0954453 Replace buffer with range
c3d6c5fc Replace buffer with range
0f987731 add transition helper to format.h
d165d9c4 Decouple locale and buffer
36634140 Parameterize basic_writer on buffer type
6f2769d0 Revert "Added support for format string containing '\0' in _format udl (#619) (#620)"
5f1c73db Shorten a comment in locale.h
31934602 Update version
51a16f8c Update ChangeLog.rst
a0087460 Merge release branch
941663d0 Merge ostream.cc into ostream.h
955062da Merge printf.cc into printf.h
5705bf1c Added support for pre-c++17 experimental string_view (#607)
cabce31f Update syntax.rst
ccaae0c0 Refer to jeaiii project
e3715102 Add a integer formatter based on jeaiii
b3495f2e Update README.rst
61f296e3 Move FMT_HAS_BUILTIN to format.h
ce801c90 Remove dependency on <vector> and <array>
41fc2990 Merge branch 'std' of github.com:fmtlib/fmt into std
971fb584 Allow mixing named and automatic arguments
af0f21da add missing inline in header-only mode (#626)
7cea1638 numeric -> arithmetic
5328907f Get rid of <limits> dependency
faaafc7e Remove <utility> dependency and replace typedefs with using
94edb1a7 Add a lightweight header for the core API
3aaa25fa Added support for format string containing '\0' in _format udl (#619) (#620)
84bd2f19 Merge include/fmt/CMakeLists.txt into the main CMake file
7f351dec Decouple <locale> for better compile times
81bd9e8e args -> format_args
10e70a06 Improve handling of custom arguments
e0243000 arg_index -> arg_id
ac5f9520 Automatically add package to release
0e914372 Avoid conflict with the macro CHAR_WIDTH
f03a35a6 Check string specs at compile time
e9da5741 Check char specs at compile time
b25a0292 Check pointer type specs are compile time
c8a9d902 Check floating-point type specifiers
6570dc31 Disallow formatting of multibyte strings into a wide buffer (#606)
3851994a Fix yet another internal compiler error in MSVC
44e18651 Refactor parse context and fix warnings
e7e270f5 Test error on invalid type spec and remove unused alias
692b82d3 UdlArg -> udl_arg
c523dd58 Use error handler to report errors
5a32e64b More tests
093e2a47 Improve error handling
dc104cba Workaround internal compiler errors in MSVC
39411504 More tests
e3eb5ea0 Add parse_context::error_handler()
734e722d Fix warnings
62af25dc Workaround yet another MSVC internal error
594bd8fe More tests
f2b52bba More tests
dfdb1ade More tests
7967c2f8 Disable test that triggers an MSVC bug
18a0b94b Fix overflow check
686ff942 Fix compile-time parsing and add more tests
5b95b5d7 Test compile-time errors
246bdafc Add FMT_STRING macro for compile-time strings
e8055433 Remove FMT_USE_VARIADIC_TEMPLATES
dba1ccc4 Update readme
e613b3c7 Update readme
9fda7a36 Check integral type specs at compile time
92847a0d Add integral type handler
a03842b0 More compile-time checks
1c855a47 Integrate constexpr format specs parsing
780b44bf Add compile-time format string check
8ca6e76d Detect user-defined literal templates
a7e98616 Workaround another MSVC madness
db9ffa14 Make parse_format_string constexpr
e926ae78 Add parse_format_string
57e266ab Rename handlers
d29c7c3a Workaround a bug in MSVC
aadb38a5 Make specs_checker constexpr
dd0b72e1 Remove refactoring artefacts
e52b10e3 Merge branch 'vitaut-patch-1' of github.com:fmtlib/fmt into std
529d88ce Make dynamic_format_specs construction constexpr
d2f2a8b0 constexpr support of dynamic width and precision
6b3840b7 Make format_specs construction constexpr
a38bd9ca Fix formatting and naming
91014f01 Naming conventions
932ab2bf Report error from parse_nonnegative_int via handler
0ebdf41e Fix compile-test
170f5c67 Move headers to include/fmt
3d11eac7 Workaround another MSVC constexpr bug
c69e3086 Update README.rst
25aac0be Fix travis build on macOS
b83241ff Make format spec parsing constexpr
bd5188c8 Remove MinGW because it's not on appveyor image
62616b88 Workaround a bug in MSVC's constexpr handling
b8f85f67 Use Visual Studio 2017 image on appveyor
7174de0d Fix contexpr-ness of pointer_from
3785afc5 Pass errors to handler instead of throwing (#566)
1b5ccf6c Make parse_arg_id constexpr
17f93fe0 Make basic_string_view ctors constexpr
d5e918b6 Detect C++14 compiler support
be5b4552 Make null_terminating_iterator more iteratory
643fb066 Check for argument indexing switch
d45544d1 Fix width handling in dynamic formatting
8cbf5447 Add parse context
ec4f5175 Replace Range with ParseContext in parse()
83dd2ab9 Simplify dynamic_specs_handler
5a8ae0bb Fix a warning
39bc319b Update test results
534bff7d Fix handling of max packed arguments
0cda806d Fix compile tests
a3191a99 Get rid of FMT_MAKE_WSTR_VALUE macro
fced79b0 Get rid of old compat macros
be887d92 Replace internal::get with std::declval
53cf0735 Get rid of FMT_MAKE_VALUE macro
2972de4b Char -> char_type
9ee7c216 Type -> type
1a09194a Cleanup type handling
c18a4041 Remove conditional and to_iterator
1cade7ef Remove FMT_USE_RVALUE_REFERENCES
7413239f Remove unnecessary qualification
af00e4f9 Remove printf_arg_formatter from format.h and cleanup
44a26e5e CharPtr -> pointer_type and move to writer
0fbd8465 Replace fmt::internal::make_unsigned with std::make_unsigned
8a2bc0ab Add nullptr support
80505995 Allow delayed type checking
b0867f3f AlignSpec -> align_spec and fix a warning
f194a418 Replace fmt::is_same with std::is_same
47c84d79 Move part of write API (spec factories) to a separate header
20168147 Add ptr, a helper function for pointer formatting
77c892c8 Fix more warnings
be7d72ba Fix expansion-to-defined warning
d4c504ae Fix a warning
27ad6cee Use standard enable_if
64681739 Fix a warning
38806167 Remove FMT_HAS_GXX_CXX11
a7320bdc Fix a warning
016acebb Remove legacy code
07f8ffc4 Suppress shadowing warnings
466386d5 Suppress a warning in gmock
70ef82a8 Workaround a bug in MSVC
5e0562ab Separate parsing and formatting
1102d465 Make format spec parsing context-independent
45911770 Separate parsing and formatting in extension API
7bd776e7 Explain why null_terminating_iterator is used
873c8451 Remove system_header pragma
9f7957c0 Separate argument parsing and formatting
da439f28 Suppress warning about missing noreturn attribute (#549)
eefdb379 Fix an unused argument warning
2f4f49fd Switch from cstring_view to string_view
a8d6f309 Minor optimizations
d16582a0 Move printf-related code to printf.cc
361911dd Use preinstalled version of cmake on travis
9ea183aa Fix MSVC build
8f4b918c Check argument index
4193485b Remove test files
07123e8f Use Ubuntu Trusty on Travis for a new CMake
586d6363 Implement more efficient handling of large number of format arguments
12252152 CStringRef -> cstring_view
5aa8d6ea Return locale by value
32ec13f1 Switch to C++ locale
b4f4b7e2 Clean the buffer API (#477)
f423e468 Replace clear() with resize(0) and data_ -> store_
23b8c24d Add noexcept
7175bd8a Fix error on MinGW
7258d1b8 Fix tests
3610f34c Fix windows build
572491ad Document which header defines formatting functions
c333dca0 Follow standard naming conventions
6a2ff287 Follow standard naming conventions
eedfd07f internal::MemoryBuffer -> basic_memory_buffer
4ec88607 ArgFormatter -> arg_formatter
50e71673 StringRef -> string_view, LongLong -> long_long
e022c21d Fix windows build
87b691d8 Merge StringWriter into StringBuffer
c2f02169 Merge ArrayWriter into FixedBuffer
fefaf07b Pass buffer instead of writer to format_value
6e568f3a buffer -> basic_buffer
bb1c82ef Fix build
a13b96ed Simplify API
624c5868 Simplify API
7ae8bd70 basic_format_arg -> basic_arg, Buffer -> buffer
bf0f1075 Parameterize format_specs on character type
296e9cad FrmatSpec -> format_spec
b5fb8dd1 stream -> buffer
984a1029 Remove IntFormatSpec and StrFormatSpec
4863730e Remove pad
aaa0fc39 Improve compatibility with old compilers and fix test
aea5d3ab Improve compatibility with older gcc and update tests
84850277 Use named argument emulation instead of nested functions
ec15ef7b Replace operator<< with write function
b77c8190 FPUtil -> fputil
8428621d BasicWriter -> basic_writer
939aff29 Remove unnecessary template arg from basic_format_args
f69786a7 Remove Not
b2a0d891 Merge value and MakeValue
acd1811c Value -> value
42a31907 Parameterize Value on context
a4d6cb32 Clean up basic_format_arg
d705d516 Parameterize basic_format_arg on context (#442)
422236af Don't erase writer type
abb6996f MakeArg -> make_arg
ee1651ce Handle empty format_arg state
3bbc5799 Fix MinGW build
63fcfc57 Fix build on older gcc
d86e51e9 Don't inherit basic_format_arg from internal::Value
f0588869 Fix handling of unpacked args (#437)
11836218 Add support for exotic character types
763ca978 Parameterize Value on character type
6cba8fe9 Move stuff out of internal::Value
e1ee5bf0 Replace StringValue with StringRef
0854f8c3 Parameterize formatting argument on char type.
9cf6c8fd Get rid of fmt::internal::Arg
5f022ae0 Remove FMT_DISPATCH
41d4bcf0 Ingore Xcode files
28429701 Merge BasicArgFormatter and ArgFormatter
d4084ac5 Get rid of ArgVisitor
d58cc8a4 Merge BasicPrintfArgFormatter and PrintfArgFormatter
e2dfd39c Update arg visitors
751ff64b Update ArgConverter to the new visitor API
c9dc41ab Replace ArgVisitor::visit with a free visit function
caa60b9c Update comment
95a53e1f Refactor argument visitor API (#422)
6d241167 Improve visitor API
a1dd524b format_arg -> do_format_arg
55a1ac50 Fix test
85793a18 Simplify API
9998f66f Replace formatter with context
2bba4203 Pass writer directly to format_value (#400)
b656a1c1 Make value the second argument to format_value
edf98792 Pass writer to format_value
64ca334a CharType -> Char
be613204 Char -> char_type
f85d5f4d BasicFormatter -> basic_formatter
18dfa257 Pass correct formatters to make_format_args
dafbec75 Fix type safety when using custom formatters (#394)
506435bf Fix formatting
f2879940 Fix formatting
48fe9783 Add format_arg::operator bool
119a63ab internal::Arg -> format_arg
65a8c2c3 format_arg -> format_value
13b04044 Add format_args::size_type
8a77e792 Enable C++11 in tests.
1e8553d6 Enable C++11 in tests.
06bab3ed Workaround mingw bug https://sourceforge.net/p/mingw/bugs/1531/
6fd6ecc1 Enable C++11 for no-windows-h-test
c4212f9e format -> vformat
21c6700b Don't build std branch with -std=c++0=98
209a1d58 Get rid of macros
9a079732 Test types
ea28a637 Get rid of FMT_VARIADIC_CTOR
0d8aca8d Get rid of FMT_VARIADIC_VOID
4ece95a7 Make make_format_args public
0028ce57 Get rid of FMT_VARIADIC
ece7ae5f Make format_arg_store convertible to format_args
621447fe Make initialization C++11-compatible
a0190e4b Add a missing include
b903f5c1 format -> vformat
43c0095a Refactor type mapping
4873685c ArgArray -> format_arg_store
fc73e106 ArgList -> format_args
92605eb4 Remove FMT_USE_VARIADIC_TEMPLATES
9bb213e9 FormatError -> format_error
REVERT: 135ab5cf Update version
REVERT: 93d95f17 Fix markup
REVERT: 4f15c72f Fix markup
REVERT: e9b19414 Automatically add package to release
REVERT: c3d1f604 Fix markup
REVERT: c96062bf Update changelog and version number

git-subtree-dir: externals/fmt
git-subtree-split: 3e75ad9822980e41bc591938f26548f24eb88907
2020-04-22 20:57:22 +01:00
Merry
c649f11c0a Merge pull request #401 from lioncash/folding
constant_propagation_pass: Fold &, |, ^, and ~ operations where applicable
2020-04-22 20:56:01 +01:00
MerryMage
2524d536b0 A32/ir_emitter: Bugfix: ExceptionRaised was producing incorrect PC
Use actual PC and not pipelined PC.
2020-04-22 20:56:01 +01:00
Lioncash
c09f4cf28e constant_propagation_pass: Fold NOT operations 2020-04-22 20:55:50 +01:00
Lioncash
d69fceec55 value: Move ImmediateToU64() to be a part of Value's interface
This'll make it slightly nicer to do basic constant folding for 32-bit
and 64-bit variants of the same IR opcode type. By that, I mean it's
possible to inspect immediate values without a bunch of conditional
checks beforehand to verify that it's possible to call GetU32() or
GetU64, etc.
2020-04-22 20:55:50 +01:00
Lioncash
8013548bbb constant_propagation_pass: Fold OR operations 2020-04-22 20:55:50 +01:00
MerryMage
ca603c1215 reg_alloc: Emit AVX instructions where able
Smaller codesize.
2020-04-22 20:55:50 +01:00
Lioncash
898d096e39 constant_propagation_pass: Fold AND operations 2020-04-22 20:55:50 +01:00
MerryMage
e2358af5ef abi: Emit AVX instructions where able
Smaller codesize.
2020-04-22 20:55:50 +01:00
Lioncash
f40fcda1f6 ir/value: Add member function to check whether or not all bits of a contained value are set
This is useful when we wish to know if a contained value is something
like 0xFFFFFFFF, as this helps perform constant folding. For example the
operation: x & 0xFFFFFFFF can be folded to just x in the 32-bit case.
2020-04-22 20:55:50 +01:00
MerryMage
7c0378f56d a64_exclusive_monitor: Loosen memory ordering requirements
It is not necessary to be as strict as it was.
2020-04-22 20:55:50 +01:00
Lioncash
0ea99b7d59 constant_propagation_pass: Fold EOR operations
It's possible to fold cases of exclusive OR operations if they can be
known to be an identity operation, or if both operands happen to be known
immediates, in which case we can just store the result of the
exclusive-OR directly.
2020-04-22 20:55:50 +01:00
MerryMage
f0920c0ded Fix VShift terminology
An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift.

- Rename VectorLogicalVShiftS* -> VectorArithmeticVShift*
- Rename VectorLogicalVShiftU* -> VectorLogicalVShift*
2020-04-22 20:55:50 +01:00
MerryMage
b51dae790d emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS16 2020-04-22 20:55:50 +01:00
MerryMage
bd47f2ca8f emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS64 2020-04-22 20:55:50 +01:00
MerryMage
3bf183d7e8 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftS32 2020-04-22 20:55:50 +01:00
MerryMage
94f9d402eb emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftU16() 2020-04-22 20:55:50 +01:00
MerryMage
6d9639e3b0 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU64() 2020-04-22 20:55:50 +01:00
MerryMage
bbc066a266 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU32() 2020-04-22 20:55:50 +01:00
Lioncash
da2e7fad87 emit_x64_vector: SSSE3 variant of EmitVectorCountLeadingZeros8()
pshufb lyfe
2020-04-22 20:55:50 +01:00
VelocityRa
c30b8dbe99 decoders: Cast to correctly-sized type before shifting
Fixes decoding for 64-bit instructions

Does not help/apply to any currently supported ARM versions (since
all are 32-bit length or below), it's for future-proofing should
such an arch be supported.
2020-04-22 20:55:50 +01:00
MerryMage
238f2f2cd0 a64_emit_x64: Lowercase PAGE_SIZE
PAGE_SIZE is defined as a macro by musl.
2020-04-22 20:55:50 +01:00
MerryMage
7162f6f254 emit_x64_vector_floating_point: SSE4.1 implementation of EmitFPVectorToFixed 2020-04-22 20:55:50 +01:00
MerryMage
e7a5592699 emit_x64_vector_floating_point: EmitFPVectorRoundInt: Use FCODE 2020-04-22 20:55:50 +01:00
MerryMage
b8fde48732 emit_x64_vector: AVX implementation for EmitVectorCountLeadingZeros8 2020-04-22 20:55:50 +01:00
MerryMage
fd37b637aa emit_x64_vector: SSE implementation of EmitVectorCountLeadingZeros16 2020-04-22 20:55:50 +01:00
MerryMage
09bf273bc8 A64: Implement SCVTF, UCVTF (vector, fixed-point), scalar variant 2020-04-22 20:55:06 +01:00
MerryMage
03ad2072a7 emit_x64_floating_point: Reduce fallback LUT code in EmitFPToFixed 2020-04-22 20:55:06 +01:00
MerryMage
f9129db6fd A64: Implement FCVTZS, FCVTZU, UCVTF, SCVTF (vector, fixed-point), vector variant 2020-04-22 20:55:06 +01:00
Lioncash
48df9b9a7d A64: Implement UQSHL's vector immediate and register variants 2020-04-22 20:55:06 +01:00
Lioncash
d426dfe942 ir: Add opcodes for unsigned saturating left shifts 2020-04-22 20:55:06 +01:00
Lioncash
ab60720418 A64/translate/impl: Make signatures consistent for unimplemented by-element SIMD variants
Makes them all consistent, so it isn't necessary to change the
prototypes over when implementing them.
2020-04-22 20:55:06 +01:00
Lioncash
6b5ea6ee66 A64: Implement BRK
Currently, we can just implement this as part of the exception
interface, similar to how it's done for the A32 interface with BKPT.
2020-04-22 20:55:06 +01:00
Lioncash
b915364c16 A64/imm: Add full range of comparison operators to Imm template
Makes the comparison interface consistent by providing all of the
relevant members. This also modifies the comparison operators to take
the Imm instance by value, as it's really only a u32 under the covers,
and it's cheaper to shuffle around a u32 than a 64-bit pointer address.
2020-04-22 20:55:06 +01:00
MerryMage
02150bc0b7 IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed 2020-04-22 20:55:06 +01:00
MerryMage
027b0ef725 A64: Implement SCVTF, UCVTF (scalar, fixed-point) 2020-04-22 20:55:06 +01:00
MerryMage
8051f60db0 opcodes.inc: Align columns to a tabstop of 4 2020-04-22 20:55:06 +01:00
MerryMage
90193b0e3d IR: Add fbits argument to FixedToFP-related opcodes 2020-04-22 20:55:06 +01:00
Lioncash
616a153c16 A64: Implement SQSHL's vector immediate variant 2020-04-22 20:55:06 +01:00
Lioncash
e8b0f25dff A64: Implement SQSHL's vector register variant 2020-04-22 20:55:06 +01:00
Lioncash
b14eaaec46 ir: Add opcodes for left signed saturated shifts 2020-04-22 20:55:06 +01:00
Lioncash
da55ed7b31 branch: Make variables const where applicable 2020-04-22 20:55:06 +01:00
Lioncash
867b666285 move_wide: Make variables const where applicable 2020-04-22 20:55:06 +01:00
Lioncash
78024a9dc4 load_store_register_unprivileged: Make variables const where applicable 2020-04-22 20:55:06 +01:00
Lioncash
e45e5da610 load_store_register_immediate: Place conditional bodies on their own line
Makes the conditionals visually consistent with the rest of the
codebase.
2020-04-22 20:55:06 +01:00
Lioncash
b586cf3f56 load_store_load_literal: Make variables const where applicable 2020-04-22 20:55:06 +01:00
Lioncash
c3a3b9687e data_processing_logical: Move datasize declarations after early-exit conditionals
While we're at it, make variables const where applicable.
2020-04-22 20:55:06 +01:00
Lioncash
ed797e6540 data_processing_conditional_select: Make variables const where applicable
Makes CSEL's function consistent with all of the others.
2020-04-22 20:55:06 +01:00
Lioncash
c82fa5ec5a data_processing_addsub: Move datasize declarations after early-exit conditionals
While we're at it, also make relevant variables const where applicable
2020-04-22 20:55:06 +01:00
Lioncash
f4a66d2477 data_processing_bitfield: Move datasize variables after early-exit conditionals
Moves the declaration of datasize to the scope that it's used within.
This also takes the opportunity to apply const where applicable, and
make early-exits all vertically consistent with one another.
2020-04-22 20:55:06 +01:00
Lioncash
2e0fcd6161 A64: Implement CLS's vector variant
Leverages CLZ like the integral variant does.
2020-04-22 20:55:06 +01:00
Lioncash
a2cd643525 emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked
Given this is just an internal helper function, it can be marked static.
2020-04-22 20:55:06 +01:00
Lioncash
c39ea2e3c9 perf_map: Use std::string_view instead of std::string for PerfMapRegister()
We can just use a non-owning view into a string in this case instead of
potentially allocating a std::string instance.
2020-04-22 20:55:06 +01:00
MerryMage
12243692f5 A64: Implement SQRDMULH (vector), vector variant 2020-04-22 20:55:06 +01:00
MerryMage
a9ffcf08b1 A64: Implement SQDMULL (vector), vector variant 2020-04-22 20:55:06 +01:00
MerryMage
3e447614c6 IR: Add VectorSignedSaturatedDoublingMultiplyLong 2020-04-22 20:55:06 +01:00
MerryMage
06b31448aa emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
MerryMage
08c0e017a5 IR: Implement Vector{Signed,Unsigned}Multiply{16,32} 2020-04-22 20:55:06 +01:00
Lioncash
b6df34cdde backend_x64/a64_interface: Re-enable the constant folding pass
This was disabled for debugging, but never re-enabled. Just to be sure,
testing was done downstream in yuzu to make sure this didn't happen to
break anything (which seems to be the case).
2020-04-22 20:55:06 +01:00
MerryMage
06ba397af2 emit_x64_vector_floating_point: Hardware FMA implementation for RSqrtStepFused 2020-04-22 20:55:06 +01:00
MerryMage
e553c4fe8d emit_x64_vector_floating_point: Hardware FMA implementation of FPVectorRecipStepFused 2020-04-22 20:55:06 +01:00
MerryMage
3caeb62ef1 emit_x64_floating_point: Hardware FMA implementation of FPRSqrtStepFused 2020-04-22 20:55:06 +01:00
MerryMage
344ee76aba emit_x64_floating_point: Hardware FMA implementation of FPRecipStepFused{32,64} 2020-04-22 20:55:06 +01:00
MerryMage
1492573267 emit_x64_vector: SSE implementation of VectorSignedSaturatedAccumulateUnsigned{8,16,32} 2020-04-22 20:55:06 +01:00
Lioncash
26df6e5e7b emit_x64_vector: Correct static asserts for < 64-bit type checks in saturated accumulate fallbacks
I had initially meant to use BitSize() here, not sizeof()
2020-04-22 20:55:06 +01:00
MerryMage
a4a26ac226 emit_x64_vector: EmitVectorSignedSaturatedAccumulateUnsigned64: SSE implementation 2020-04-22 20:55:06 +01:00
MerryMage
a7c66d2d28 emit_x64_vector: Simplify fpsr_qc related code
Move the bool conversion into A64JitState::GetFpsr so we don't have to continuously
pay the cost of conversion for every saturation instruction.
2020-04-22 20:55:06 +01:00
Lioncash
112cff9ab9 A64: Implement CLZ's vector variant 2020-04-22 20:55:06 +01:00
Lioncash
e739624296 ir: Add opcodes for vector CLZ operations
We can optimize these cases further for with the use of a fair bit of
shuffling via pshufb and the use of masks, but given the uncommon use of
this instruction, I wouldn't consider it to be beneficial in terms of
amount of code to be worth it over a simple manageable naive solution
like this.

If we ever do hit a case where vectorized CLZ happens to be a
bottleneck, then we can revisit this. At least with AVX-512CD, this can
be done with a single instruction for the 32-bit word case.
2020-04-22 20:55:05 +01:00
MerryMage
d4c37a68a8 A64/translate: VectorZeroUpper for V(64) stores
Ensures correctness.
2020-04-22 20:55:05 +01:00
MerryMage
b8daa4feac simd_two_register_misc: FNEG (vector) with Q == 0 had dirty upper 2020-04-22 20:55:05 +01:00
Lioncash
5653e7637e emit_x64_vector: Remove unnecessary [[maybe_unused]] attributes
These were unintentionally left in when introducing SUQADD and USQADD
2020-04-22 20:55:05 +01:00
Lioncash
14e026a7f0 A64: Implement USQADD's scalar and vector variants 2020-04-22 20:55:05 +01:00
Lioncash
d4a76aaa04 ir: Add opcodes form unsigned saturated accumulations of signed values 2020-04-22 20:55:05 +01:00
Lioncash
18ad7f237d A64: Implement SUQADD's scalar and vector variants 2020-04-22 20:55:05 +01:00
Lioncash
6f911a26da ir: Add opcodes for signed saturated accumulations of unsigned values 2020-04-22 20:55:05 +01:00
Lioncash
9a3d38d2ee A64: Implement SMLAL{2}, SMLSL{2}, UMLAL{2}, and UMLSL{2}'s vector by-element variants
We can simply modify the general function made for SMULL{2} and
UMULL{2}'s by-element variants to also handle the other multiply-based
by-element variants.
2020-04-22 20:55:05 +01:00
Lioncash
6ccfbc9b39 A64: Implement UMULL{2}'s vector by-element variant 2020-04-22 20:55:05 +01:00
Lioncash
58e21f175c A64: Implement SMULL{2}'s vector by-element variant 2020-04-22 20:55:05 +01:00
Lioncash
134bb02e19 ir/value: Replace includes with forward declarations
enum classes are still considered complete types when forward declared
(as the compiler knows the exact size of the type from the declaration
alone). The only difference in this case being that the members of the
enum class aren't visible. Given we don't use the members within this
header in any way, we can simply forward declare them here and remove
the inclusions.
2020-04-22 20:55:05 +01:00
Lioncash
2c8e07e7d0 ir/cond: Migrate to C++17 nested namespace specifiers 2020-04-22 20:55:05 +01:00
Lioncash
c3b7819a55 CMakeLists: Add missing cond.h header to file listing
Allows the file to show up within IDEs more easily.
2020-04-22 20:55:05 +01:00
Lioncash
0a3976059f A64: Implement URSQRTE 2020-04-22 20:55:05 +01:00
Lioncash
b6e74fd17d ir: Add opcodes for performing unsigned reciprocal square root estimates 2020-04-22 20:55:05 +01:00
Lioncash
bd3582e811 A64: Implement URECPE 2020-04-22 20:55:05 +01:00
Lioncash
af83360f89 ir: Add opcodes for unsigned reciprocal estimate 2020-04-22 20:55:05 +01:00
Lioncash
740ffa52ae A64: Implement SQNEG's scalar and vector variant 2020-04-22 20:53:46 +01:00
Lioncash
fca7eddb9e A64: Add opcodes for signed saturating negations 2020-04-22 20:53:46 +01:00
Lioncash
f1ebbcd7bc emit_x64_vector: Simplify "position == 0" case for EmitVectorExtract()
In the event position is zero, we can just treat it as a NOP, given
there's no need to move the data.
2020-04-22 20:53:46 +01:00
Lioncash
87372917f9 emit_x64_vector: Simplify "position == 0" case for EmitVectorExtractLower()
In the event position == 0, we can just treat it as a simple movq,
clearing the upper half of the XMM register. This also makes that case
use only one register.
2020-04-22 20:53:46 +01:00
Lioncash
f5fb496e7e A64: Implement SQDMULH's by-element scalar variant 2020-04-22 20:53:46 +01:00
Lioncash
40f0576995 A64: Implement SQDMULH's by-element vector variant 2020-04-22 20:53:46 +01:00
MerryMage
8f9206901d backend/x64: Do not clear fast_dispatch_table if not enabled
There is no need to pay for the cost of setting a large block of memory if we're not using it.
2020-04-22 20:53:46 +01:00
MerryMage
9b65100660 A64: Implement FastDispatchHint 2020-04-22 20:53:46 +01:00
MerryMage
f96c43d422 A32: Implement FastDispatchHint 2020-04-22 20:53:46 +01:00
MerryMage
aa8d826c13 ir/terminal: Add FastDispatchHint 2020-04-22 20:53:46 +01:00
Lioncash
1a69a61cb4 A64: Implement SQDMULH's scalar variant 2020-04-22 20:53:46 +01:00
Lioncash
7ebfd0f31c ir: Add opcodes for scalar signed saturated doubling multiplies 2020-04-22 20:53:46 +01:00
Lioncash
9c03311fed A64: Implement SQDMULH's vector variant 2020-04-22 20:53:46 +01:00
Lioncash
a0231e5546 ir: Add opcodes for signed saturated doubling multiplies 2020-04-22 20:53:46 +01:00
Lioncash
db24e1f09b A64: Implement SQABS' scalar variant 2020-04-22 20:53:46 +01:00
Lioncash
bda5d14c7f A64: Implement SQABS' vector variant. 2020-04-22 20:53:46 +01:00
Lioncash
0507e47420 ir: Add opcodes for signed saturated absolute values 2020-04-22 20:53:46 +01:00
MerryMage
27427595b7 emit_x64_floating_point: EmitFPToFixed: maxsd optimization
maxsd is not required when doing a signed conversion, because x64
produces a 0x80...00 value for out of range values.
2020-04-22 20:53:46 +01:00
MerryMage
1abf82ac4a emit_x64_floating_point: ZeroIfNaN: pxor -> xorps
xorps is shorter and more appropriate here.
2020-04-22 20:53:46 +01:00
MerryMage
3415828fb4 IR: Simplify FP{Single,Double}ToFixed{U,S}{32,64} 2020-04-22 20:53:46 +01:00
Lioncash
e30f9816ec A32/decoder: Add missing <algorithm> includes
These includes should be present, as we use std::find_if() within these headers.
2020-04-22 20:53:46 +01:00
Lioncash
4507627905 emit_x64_vector: Provide AVX path for EmitVectorMinU64() 2020-04-22 20:53:46 +01:00
Lioncash
fd49a62b06 emit_x64_vector: Provide AVX path for EmitVectorMinS64() 2020-04-22 20:53:46 +01:00
Lioncash
770723f449 emit_x64_vector: Provide AVX path for EmitVectorMaxU64() 2020-04-22 20:53:46 +01:00
Lioncash
8fb90c0cf1 emit_x64_vector: Provide AVX path for EmitVectorMaxS64() 2020-04-22 20:53:46 +01:00
Lioncash
2cac6ad129 emit_x64_vector: Simplify EmitVectorLogicalLeftShift8()
Similar to EmitVectorLogicalRightShift8(), we can determine a mask ahead
of time and just and the results of a halfword left shift.
2020-04-22 20:53:46 +01:00
Lioncash
135107279d emit_x64_vector: Simplify EmitVectorLogicalShiftRight8()
We can generate the mask and AND it against the result of a halfword
shift instead of looping.
2020-04-22 20:53:46 +01:00
Lioncash
2952b46b16 emit_x64_vector: Amend value definition in SSE 4.1 path for EmitVectorSignExtend16()
We should be defining the value after the results have been calculated
to be consistent with the rest of the code.
2020-04-22 20:53:46 +01:00
Lioncash
fda19095ea emit_x64_vector: Remove fallback in EmitVectorSignExtend64()
This is fairly trivial to do manually.
2020-04-22 20:53:46 +01:00
Lioncash
39593fcd26 emit_x64_vector: Remove fallback for EmitVectorSignExtend32()
We can just do the extension manually, which gets rid of the need to
fall back here.
2020-04-22 20:53:46 +01:00
Lioncash
053175f69b ir_emitter: Rename fpscr_controlled parameters to fpcr_controlled
Part of addressing #333
2020-04-22 20:53:46 +01:00
MerryMage
f0184c4b8d a32/exception_generating: BPKT: Define unpredictable behaviour
Define unpredictable behaviour to be BKPT executes conditionally
2020-04-22 20:53:46 +01:00
MerryMage
a12854857b A32: Add define_unpredictable_behaviour option 2020-04-22 20:53:46 +01:00
MerryMage
b0abaa8312 A32/location_descriptor: Change formatting to use hex 2020-04-22 20:53:46 +01:00
MerryMage
ccbf6c7f63 microinstruction: A32ExceptionRaised causes CPU exception 2020-04-22 20:53:46 +01:00
MerryMage
6595e49a31 A32/types: CondToString: Add nv 2020-04-22 20:53:46 +01:00
MerryMage
d5b9c4a4bb block_of_code: Hide NX support behind compiler flag
Systems that require W^X can use the DYNARMIC_ENABLE_NO_EXECUTE_SUPPORT cmake option.
2020-04-22 20:53:46 +01:00
MerryMage
de4494ffa5 Implement perfmap 2020-04-22 20:53:46 +01:00
MerryMage
f73104633b a32_emit_x64: Fix incorrect BMI2 implementation for SetCpsr
* The MSB for each byte in cpsr_ge were not being appropriately set.
* We also expand test coverage to test this case.
* We fix the disassembly of the MSR (imm) and MSR (reg) instructions as well.
2020-04-22 20:53:46 +01:00
MerryMage
3432a08e0a backend/x64: Support W^X systems
Closes #176.
2020-04-22 20:53:46 +01:00
BreadFish64
2a65442933 Backend: Create "backend" folder
similar to the "frontend" folder
2020-04-22 20:53:46 +01:00
MerryMage
3b13f1eb12 A64/translate: Standardize arguments of helper functions
Don't pass in IREmitter when TranslatorVisitor is already available.
2020-04-22 20:53:45 +01:00
MerryMage
a4e556d59c A64/translate: Standardize TranslatorVisitor abbreviation
Prefer v to tv.
2020-04-22 20:53:45 +01:00
MerryMage
9a0dc61efd emit_x64_vector: Avoid recalculating addresses in EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
Lioncash
3d465e2c36 A64: Implement SQXTN, SQXTUN, and UQXTN's scalar variants
We can implement these in terms of the vector variants
2020-04-22 20:53:45 +01:00
Lioncash
4ff39c6ea8 A64: Implement SDOT and UDOT's (by element) variants
Gets all of the dot product instructions out of the way.
2020-04-22 20:53:45 +01:00
MerryMage
21df1fb539 emit_x64_vector: Don't load zero constant from memory in EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
MerryMage
3bbcca8757 emit_x64_vector: Special-case is_defaults_zero && table_size == 2 in EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
MerryMage
9cc00f900c emit_x64_vector: Release registers when possible in EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
MerryMage
a12afd1065 reg_alloc: Add the ability to Release an allocation early 2020-04-22 20:53:45 +01:00
MerryMage
e68bd3c6c1 emit_x64_vector: Special-case table_size == 1 in EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
MerryMage
a4e1f8a63a emit_x64_vector: SSE4.1 implementation of EmitVectorTableLookup 2020-04-22 20:53:45 +01:00
MerryMage
0c18b85c27 A64: Implement TBL and TBX 2020-04-22 20:53:45 +01:00
MerryMage
89d08c7d61 IR: Add VectorTable and VectorTableLookup IR instructions 2020-04-22 20:53:45 +01:00
MerryMage
0288974512 opcodes: Cleanup opcodes table
* Remove T:: prefix from types.
* Add another column for a 4th argument.
2020-04-22 20:53:45 +01:00
Lioncash
d9fc6cf31f A64: Implement SDOT and UDOT's vector variant 2020-04-22 20:53:45 +01:00
Lioncash
cb5e5c5d49 A64: Implement SADALP and UADALP
While we're at it we can join the code for SADDLP and UADDLP with these
instructions, since the only difference is we do an accumulate at the
end of the operation.
2020-04-22 20:53:45 +01:00
Lioncash
29f8b30634 A64: Implement SRSHL and URSHL
Implements both scalar and vector variants.
2020-04-22 20:53:45 +01:00
Lioncash
0efa2ce3b0 ir: Add opcodes for performing rounding left shifts 2020-04-22 20:53:45 +01:00
MerryMage
656ceff225 emit_x64_floating_point: Fix smallest normal check in EmitFPMulAdd 2020-04-22 20:53:45 +01:00
Lioncash
f3f60cd179 A64: Implement ISB
Given we want to ensure that all instructions are fetched again, we can
treat an ISB instruction as a code cache flush.
2020-04-22 20:53:45 +01:00
Lioncash
be53e356a2 A64: Implement FCVTN{2} 2020-04-22 20:53:45 +01:00
Lioncash
4c3d7c5a8d A64: Implement FCVTL{2} 2020-04-22 20:53:45 +01:00
Lioncash
7eb6be7a6a A64: Implement FMAXNM and FMINNM vector variants.
Currently we can implement these in terms of the scalar IR variants.
2020-04-22 20:53:45 +01:00
Lioncash
8b65ea68c0 A64: Implement FMAXP, FMAXNMP, FMINP, and FMINNMP's vector variants
We can just implement these in terms of scalars for the time being.
2020-04-22 20:53:45 +01:00
MerryMage
ec76f95f5a emit_x64_vector_floating_point: Correct value of smallest_normal_number 2020-04-22 20:53:45 +01:00
MerryMage
e60d6c0d20 fp/info: Incorrect point_position in FPValue 2020-04-22 20:53:45 +01:00
MerryMage
8a3b6364c2 load_store_exclusive: Define s == t state to be Constraint_NONE
Downstream (yuzu) mentioned that the instruction:

STXR W9, W9, [X0]

was executed in the program "Crash N-Sane Trilogy".
2020-04-22 20:53:45 +01:00
MerryMage
cd40e4dae0 A64/translate: Allow for unpredictable behaviour to be defined 2020-04-22 20:53:45 +01:00
MerryMage
d1d6f4feb5 system: Implement MRS CNTFRQ_EL0 2020-04-22 20:53:45 +01:00
Lioncash
7ef7def661 A64: Implement SQ{ADD, SUB}, and UQ{ADD, SUB}'s vector variants
Currently we implement these in terms of the scalar variants. Falling
back to the interpreter is slow enough to make it more effective than
doing that.
2020-04-22 20:46:23 +01:00
Lioncash
a4b0e2ace6 A64: Implement UQADD/UQSUB's scalar variants 2020-04-22 20:46:23 +01:00
Lioncash
acbaf04fef ir: Add opcodes for unsigned saturating add and subtract 2020-04-22 20:46:23 +01:00
Lioncash
c41b5a3492 x64/reg_alloc: Use type alias for array returned by GetArgumentInfo()
This way if the number ever changes, we don't need to change the type in
other places.
2020-04-22 20:46:23 +01:00
Lioncash
2188765e28 ir/value: Use type alias CoprocessorInfo for std::array<u8, 8>
Provides a more descriptive label for the interface, and avoids the need
to hardcode the array size in multiple places.
2020-04-22 20:46:23 +01:00
MerryMage
71e137715d status_register_access: Add support for bits 0 and 1 of mask to MSR 2020-04-22 20:46:23 +01:00
MerryMage
ac51c2547d A32/translate/load_store: Correct detection of writeback 2020-04-22 20:46:23 +01:00
MerryMage
d345220251 A32/translate: Add TranslateSingleInstruction 2020-04-22 20:46:23 +01:00
MerryMage
5fc197c564 A32/ir_emitter: Bug fix: IREmitter::ExceptionRaised using incorrect opcode 2020-04-22 20:46:23 +01:00
MerryMage
ff3805e332 A32/decoders: Split instruction list into include file 2020-04-22 20:46:23 +01:00
MerryMage
3f4d118d73 microinstruction: Improve assert messages 2020-04-22 20:46:23 +01:00
MerryMage
a7e6f2a235 emit_x64_vector: EmitVectorNarrow16: AVX512 implementation 2020-04-22 20:46:23 +01:00
MerryMage
b6350e3947 emit_x64_vector: EmitVectorNarrow32: prefer pblendw to loading constant 2020-04-22 20:46:23 +01:00
MerryMage
8fdba189cb emit_x64_vector: packusdw is SSE4.1 2020-04-22 20:46:23 +01:00
MerryMage
1ef388d1cd emit_x64_vector_floating_point: Simplify FPVector{Min,Max} 2020-04-22 20:46:23 +01:00
MerryMage
4a1ce797cb emit_x64_vector_floating_point: Simplify Get*Vector functions 2020-04-22 20:46:23 +01:00
MerryMage
bcaced297a emit_x64_floating_point: Remove EmitProcessNaNs 2020-04-22 20:46:23 +01:00
MerryMage
2e0885388e devirtualize: Replace DEVIRT macro with function template 2020-04-22 20:46:23 +01:00
Lioncash
54d8552177 a32_emit_x64: std::move A32::UserConfig in the constructor
This avoids a few redundant atomic increments and decrements,
considering the UserConfig instance contains a std::array of
std::shared_ptr<Coprocessor> instances.
2020-04-22 20:46:23 +01:00
MerryMage
b098c650df emit_x64_floating_point: Use EmitPostProcessNaNs in EmitFPMulX 2020-04-22 20:46:23 +01:00
MerryMage
c1babf41b2 emit_x64_floating_point: Remove unnecessary DenormalsAreZero from EmitFPSingleToDouble and EmitFPDoubleToSingle 2020-04-22 20:46:23 +01:00
MerryMage
700088408d emit_x64_floating_point: Simplify EmitFP{Min,Max}{,Numeric}{32,64} 2020-04-22 20:46:23 +01:00
MerryMage
07e0585994 emit_x64_floating_point: Reduce NaN processing overhead 2020-04-22 20:46:23 +01:00
MerryMage
f5e11d117a A64: Implement FMULX, scalar single/double variant 2020-04-22 20:46:23 +01:00
MerryMage
17f73974f2 IR: Implement FPMulX IR instruction 2020-04-22 20:46:23 +01:00
Lioncash
391e16be64 emit_x64_vector: Vectorize 32-bit variants of paired min/max
Gets rid of the fallbacks for these cases.
2020-04-22 20:46:23 +01:00
MerryMage
5ae045d67e emit_x64_vector: Improve code emission of VectorGetElement* for index == 0 2020-04-22 20:46:23 +01:00
MerryMage
e9ab7f7664 reg_alloc: Do a UseScratch if a Use destination is too small 2020-04-22 20:46:23 +01:00
MerryMage
90f8dda966 emit_x64_floating_point: AVX implementation of ForceToDefaultNaN 2020-04-22 20:46:23 +01:00
MerryMage
dfb660cd16 emit_x64_vector_floating_point: Prefer blendvp{s,d} to vblendvp{s,d} where possible
It's a cheaper instruction.
2020-04-22 20:46:23 +01:00
MerryMage
476c0f15da backend_x64: Remove all use of xmm0 2020-04-22 20:46:23 +01:00
MerryMage
8252efd7b1 emit_x64_vector_floating_point: AVX implementation of ForceToDefaultNaN 2020-04-22 20:46:23 +01:00
MerryMage
746dc521b9 emit_x64_vector_floating_point: Reduce codesize of ForceToDefaultNaN 2020-04-22 20:46:23 +01:00
MerryMage
7731dcdca9 emit_x64_vector_floating_point: Reduce codesize of EmitTwoOpVectorOperation 2020-04-22 20:46:23 +01:00
MerryMage
bb93353f94 emit_x64_vector_floating_point: Correct FMA in FTZ mode
x64 rounds before flushing to zero
AArch64 rounds after flushing to zero

This difference of behaviour is noticable if something would round to a smallest normalized number
2020-04-22 20:46:23 +01:00
MerryMage
8ef195db3c emit_x64_floating_point: DenormalsAreZero is redundant as hardware already does DAZ
Exceptions: F{MIN,MAX}{,NM}
2020-04-22 20:46:23 +01:00
MerryMage
de9d8c461c emit_x64_floating_point: FlushToZero is redundant as hardware already does FTZ 2020-04-22 20:46:23 +01:00
MerryMage
822fd4a875 backend_x64: Fix FPVectorMulAdd and FPMulAdd NaN handling with denormals
Denormals should be treated as zero in NaN handler
2020-04-22 20:46:23 +01:00
MerryMage
b393e15ab6 backend_x64: Fix bugs when FPCR.FZ=1
Bugs:
* DenormalsAreZero flushed to positive zero instead of preserving sign.
* FMAXNM/FMINNM (scalar) should perform DAZ *before* special zero handling.
* FMAX/FMIN/FMAXNM/FMINNM (vector) did not DAZ.
2020-04-22 20:46:23 +01:00
MerryMage
5e88d66470 fp/info: Deduplicate functions 2020-04-22 20:46:23 +01:00
MerryMage
2019d32743 emit_x64_floating_point: Deduplicate EmitFPMulAdd implementation 2020-04-22 20:46:23 +01:00
MerryMage
e038fe72df emit_x64_floating_point: Deduplicate code 2020-04-22 20:46:23 +01:00
MerryMage
ec82a845b7 emit_x64_vector_floating_point: Fix FPVector{Max,Min} when FPCR.DN = 1 2020-04-22 20:46:23 +01:00
MerryMage
7f27945411 emit_x64_floating_point: Fix FP{Max,Min} when FPCR.DN = 1 2020-04-22 20:46:23 +01:00
MerryMage
21a28c2545 IR: SSE4.1 implementation of FPVectorRoundInt 2020-04-22 20:46:23 +01:00
MerryMage
9669e49817 A64: Implement FRINT{N,M,P,Z,A,X,I} (vector), single/double variant 2020-04-22 20:46:23 +01:00
MerryMage
f976c47008 IR: Initial implementation of FPVectorRoundInt 2020-04-22 20:46:23 +01:00
MerryMage
f2393488fe A64: Implement SQADD and SQSUB, scalar variant 2020-04-22 20:46:23 +01:00
MerryMage
10e196480f IR: Generalise SignedSaturated{Add,Sub} to support more bitwidths 2020-04-22 20:46:23 +01:00
MerryMage
71db0e67ae a64_emit_x64: Bugfix EmitA64OrQC - Incorrect argument 2020-04-22 20:46:23 +01:00
Lioncash
d0fdd3c6e6 simd_three_same: Extract non-paired SMAX, SMIN, UMAX, UMIN code to a common function
Deduplicates a bit of code and makes its layout consistent with the
paired variants
2020-04-22 20:46:23 +01:00
Lioncash
2bea2d0512 A64: Implement SMAXP, SMINP, UMAXP, UMINP 2020-04-22 20:46:23 +01:00
Lioncash
463b9a3d02 ir: Add opcodes for vector paired maximum and minimums
For the time being, we can just do a naive implementation which avoids
falling back to the interpreter a bit. Horizontal operations aren't
necessarily x86 SIMD's forte anyways.
2020-04-22 20:46:23 +01:00
Lioncash
43344c5400 A64: Implement SMAXV, SMINV, UMAXV, and UMINV 2020-04-22 20:46:23 +01:00
Lioncash
2501bfbfae ir: Add opcodes for performing scalar integral min/max 2020-04-22 20:46:23 +01:00
Lioncash
7fdd8b0197 A64: Implement PMULL{2} 2020-04-22 20:46:23 +01:00
Lioncash
5ebf496d4e translate: Deduplicate GetDataSize() functions
Avoids defining the same function multiple times in different files.
2020-04-22 20:46:22 +01:00
Lioncash
f83cd2da9a floating_point_{conditional}_compare: Deduplicate code
Deduplicates the implementation code of instructions by extracting the
code to a common function.
2020-04-22 20:46:22 +01:00
MerryMage
f9c6d5e1a0 common: Move all cryptographic function to common/crypto 2020-04-22 20:46:22 +01:00
MerryMage
5dc23e49d7 a32_emit_x64: BMI2 implementation of A32SetCpsr 2020-04-22 20:46:22 +01:00
MerryMage
0f85305933 a32_emit_x64: Shorten EmitA32GetCpsr 2020-04-22 20:46:22 +01:00
MerryMage
9fe2bf8733 a32_emit_x64: Assert that memory layout assumption in EmitA32GetCpsr is valid 2020-04-22 20:46:22 +01:00
Lioncash
b48fb8ca6b A64: Implement PMUL 2020-04-22 20:46:22 +01:00
Lioncash
affa312d1d ir: Add opcode for performing polynomial multiplication 2020-04-22 20:46:22 +01:00
MerryMage
dd4ac86f8e A64: Implement FCVT{N,M,A,P}{U,S} (vector), FCVTZU (vector, integer), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
28b38916a8 A64: Implement FCVTZS (vector, integer), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
507bcd8b8b IR: Implement FPVectorTo{Signed,Unsigned}Fixed 2020-04-22 20:46:22 +01:00
MerryMage
8f75a1fe04 fp/info: Replace constant value generators with FPValue
Instead of having multiple different functions we can just have one.
2020-04-22 20:46:22 +01:00
MerryMage
da261772ea emit_x64_vector_floating_point: AVX implementation of FPVector{Max,Min} 2020-04-22 20:46:22 +01:00
MerryMage
a0d6f0de57 emit_x64_vector_floating_point: Remove unnecessary double jump in HandleNaNs 2020-04-22 20:46:22 +01:00
Lioncash
c778c7b868 A64: Implement FMAX's vector single and double precision variants 2020-04-22 20:46:22 +01:00
Lioncash
009879d92b A64: Implement FMIN's vector single and double precision variants 2020-04-22 20:46:22 +01:00
MerryMage
7b03da86c2 IR: Implement FPVector{Max,Min} 2020-04-22 20:46:22 +01:00
MerryMage
e76e1186bb FPRecipEstimate: Move offset out of function
MSVC has weird lambda capturing rules.
2020-04-22 20:46:22 +01:00
MerryMage
ddcff86f9c microinstruction: Update ReadsFromAndWritesToFPSRCumulativeExceptionBits 2020-04-22 20:46:22 +01:00
MerryMage
10de36394e A64: Implement FRECPS, vector/scalar single/double variants 2020-04-22 20:46:22 +01:00
MerryMage
901bd9b4e2 IR: Implement FPRecipStepFused, FPVectorRecipStepFused 2020-04-22 20:46:22 +01:00
MerryMage
f66f61d8ab A64: Implement FRECPE, vector single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
939f5f5c7a IR: Implement FPVectorRecipEstimate 2020-04-22 20:46:22 +01:00
MerryMage
27c73dd56a A64: Implement FRECPE, scalar single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
fc2d33ae7b IR: Implement FPRecipEstimate 2020-04-22 20:46:22 +01:00
MerryMage
c1dcfe29f7 IR: Implement FPRecipEstimate 2020-04-22 20:46:22 +01:00
MerryMage
7a673a8a43 fp: Change FPUnpacked to a normalized representation
Having a known position for the highest set bit makes writing algorithms easier
2020-04-22 20:46:22 +01:00
MerryMage
3fe45c6d8e block_of_code: Add ABI_PARAMS array 2020-04-22 20:46:22 +01:00
MerryMage
642b6c31d2 A64: Implement MLA, MLS (by element), vector single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
0de37b11ad A64: Implement FMLS (vector), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
64c2f698a2 emit_x64_vector_floating_point: Specify NanHandler::function_type explicitly
MSVC doesn't like dealing with auto return types
2020-04-22 20:46:22 +01:00
MerryMage
2ef59b4f03 emit_x64_vector_floating_point: ChooseOnFsize arguments maybe_unused 2020-04-22 20:46:22 +01:00
MerryMage
04f325a05e IR: Implement FPVectorNeg 2020-04-22 20:46:22 +01:00
MerryMage
934132e0c5 A64: Implement FMLA (vector), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
771a4fc20b IR: Implement FPVectorMulAdd 2020-04-22 20:46:22 +01:00
MerryMage
3218bb9890 emit_x64_vector_floating_point: Standardize naming scheme 2020-04-22 20:46:22 +01:00
MerryMage
8f72be0a02 emit_x64_floating_point: Simplify indexers 2020-04-22 20:46:22 +01:00
MerryMage
25b28bb234 emit_x64_vector_floating_point: Simplify EmitVectorOperation* 2020-04-22 20:46:22 +01:00
MerryMage
1edd0125b2 mp: rename mp.h to mp/function_info.h 2020-04-22 20:46:22 +01:00
MerryMage
0921678edb emit_x64_vector: Slightly improve ArithmeticShiftRightByte 2020-04-22 20:46:22 +01:00
MerryMage
43407c4bb4 emit_x64_vector: Simplify VectorShuffleImpl 2020-04-22 20:46:22 +01:00
MerryMage
ecbf9dbae5 IR: Implement A64OrQC 2020-04-22 20:46:22 +01:00
MerryMage
f0fecf2615 A64: Implement UQSHRN, UQRSHRN (vector) 2020-04-22 20:46:22 +01:00
MerryMage
8f4c1a8558 emit_x64_vector: -0x80000000 isn't -0x80000000 2020-04-22 20:46:22 +01:00
MerryMage
b455b566e7 A64: Implement UQXTN (vector) 2020-04-22 20:46:22 +01:00
MerryMage
e686a81612 emit_x64_vector: Fix non-SSE4.1 saturated narrowing reconstruction comparison
Allows non-SSE4.1 to produce the correct FPSR.QC flag
2020-04-22 20:46:22 +01:00
MerryMage
3874cb37e3 A64: Implement SQXTN (vector) 2020-04-22 20:46:22 +01:00
MerryMage
8ef114d48f emit_x64_vector: packusdw reqiures SSE4.1
In EmitVectorSignedSaturatedNarrowToUnsigned32.
2020-04-22 20:46:22 +01:00
MerryMage
712c6c1d7e A64: Implement SQSHRUN, SQRSHRUN (vector) 2020-04-22 20:46:22 +01:00
MerryMage
c5722ec963 simd_shift_by_immediate: Simplify ShiftRight 2020-04-22 20:46:22 +01:00
MerryMage
f020dbe4ed A64: Implement SQXTUN 2020-04-22 20:46:22 +01:00
MerryMage
6918ef7360 microinstruction: Reorganize FPSCR related instruction queries 2020-04-22 20:46:22 +01:00
Lioncash
a639fa5534 microinstruction: Add missing FP scalar opcodes to ReadsFromFPSCR() and WritesToFPSCR()
These were forgotten when the opcodes were added.
2020-04-22 20:46:22 +01:00
Lioncash
3ca18d8a6d u128: Make Bit() a const-qualified member function
This function doesn't modify the struct members, so it can be made
const.
2020-04-22 20:46:22 +01:00
MerryMage
b2e4c16ef8 A64: Implement FRSQRTS (vector), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
45dc5f74f3 A64: Implement FRSQRTE (vector), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
b74d5520f9 A64: Implement FRSQRTS (scalar), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage
506e544bfe IR: Implement FPRSqrtStepFused 2020-04-22 20:46:22 +01:00
MerryMage
6eb069e80d fp: Implement FPRSqrtStepFused 2020-04-22 20:46:22 +01:00
MerryMage
b0ff35fcd1 fp: Implement FPNeg 2020-04-22 20:46:22 +01:00
MerryMage
ca6774ccce process_nan: Add two operand variant 2020-04-22 20:46:22 +01:00
Lioncash
ace7d2ba50 A64: Implement FMAXP, FMINP, FMAXNMP and FMINNMP's scalar double/single-precision variant 2020-04-22 20:46:21 +01:00
MerryMage
66bb05fc0a emit_x64_floating_point: Fixup special NaN case in FMA FPMulAdd implementation 2020-04-22 20:46:21 +01:00
Lioncash
070637e0f6 fp: Use a forward declaration in fused.h
It's permissible to forward declare here, so we can do so and eliminate
a direct header dependency
2020-04-22 20:46:21 +01:00
Lioncash
030820f649 u128: Implement comparison operators in terms of one another
We can just implement the comparisons in terms of operator< and
implement inequality with the negation of operator==.
2020-04-22 20:46:21 +01:00
MerryMage
76b07d6646 u128: StickyLogicalShiftRight requires special-casing for amount == 64
In this case (128 - amount) == 64, and this invokes undefined behaviour
2020-04-22 20:46:21 +01:00
Lioncash
49c7edf7c6 A64: Implement FMLA and FMLS (by element)'s double/single-precision scalar variant 2020-04-22 20:46:21 +01:00
Lioncash
c704acafe4 A64: Implement FMUL (by element)'s scalar double/single-precision variant 2020-04-22 20:46:21 +01:00
MerryMage
0ce11b7b15 emit_x64_floating_point: Implement accurate fallback for FPMulAdd{32,64} 2020-04-22 20:46:21 +01:00
MerryMage
e199887fbc fp: Implement FPMulAdd 2020-04-22 20:46:21 +01:00
MerryMage
53a8c15d12 process_nan: Add FPProcessNaNs3 2020-04-22 20:46:21 +01:00
MerryMage
1c8e93e74d block_of_code: Add SysV ABI fifth and sixth parameters 2020-04-22 20:46:21 +01:00
MerryMage
1fe8f51c54 u128: Add StickyLogicalShiftRight 2020-04-22 20:46:21 +01:00
MerryMage
b0afd53ea7 u128: Add Multiply64To128 2020-04-22 20:46:21 +01:00
MerryMage
5566fab29a u128: Add u128::Bit 2020-04-22 20:46:21 +01:00
MerryMage
3e62fea003 u128: Add comparison operators 2020-04-22 20:46:21 +01:00
MerryMage
f17cd6f2c5 unpacked: Use ResidualErrorOnRightShift in FPRoundBase
Fixes a bug relating to exponents that are severely out of range.
2020-04-22 20:46:21 +01:00
MerryMage
805428e35e fp: Remove MantissaT 2020-04-22 20:46:21 +01:00
MerryMage
bda86fd167 FPRSqrtEstimate: Improve documentation of RecipSqrtEstimate 2020-04-22 20:46:21 +01:00
Lioncash
0a64a66b26 FPRSqrtEstimate: Deduplicate array bounds
Dehardcodes a few constants in the loops.
2020-04-22 20:46:21 +01:00
Lioncash
b7bd70fd19 A64: Implement FMAXV, FMINV, FMAXNMV, and FMINNMV 2020-04-22 20:46:21 +01:00
Lioncash
664fb12e21 FPRSqrtEstimate: Use forward declarations where applicable 2020-04-22 20:46:21 +01:00
Lioncash
3447c82656 translate: Return by bool in helpers where applicable
Gets rid of a bit of duplication regarding the early-out cases and makes
all helpers functions consistent (previously some had a return type of
bool, while others had a return type of void).
2020-04-22 20:46:21 +01:00
Lioncash
d65b056eba Simplify fallback case for EmitVectorSetElement64() 2020-04-22 20:46:21 +01:00
MerryMage
6087c2af6f emit_x64_floating_point: s/Esimate/Estimate/ 2020-04-22 20:46:21 +01:00
MerryMage
f837ce8e78 simd_scalar_two_register_misc: Implement FRSQRTE, scalar variant 2020-04-22 20:46:21 +01:00
MerryMage
bde58b04d4 IR: Implement FPRSqrtEstimate 2020-04-22 20:46:21 +01:00
MerryMage
16061c28f3 simd_vector_x_indexed_element: Implement FMUL (by element), vector variant 2020-04-22 20:46:21 +01:00
MerryMage
55eaa16615 a64_emit_x64: Ensure host has updated ticks in EmitA64GetCNTPCT
Discovered by @Subv.
Fixes incomplete fix begun in 5a91c94dca47c9702dee20fbd5ae1f4c07eef9df.
That fix fails to take into account that LinkBlock doesn't update ticks until there
are no remaining ticks to be executed.

Test added to confirm fix.
2020-04-22 20:46:21 +01:00
MerryMage
edd795e991 a64_emit_x64: Fix stack misalignment on Windows for 128-bit exclusive writes
Discovered by @Subv.
Includes a test to ensure this codepath is exercised on Windows.
2020-04-22 20:46:21 +01:00
Lioncash
04b4c8b0cf emit_x64_aes: Eliminate extraneous usage of a scratch register in EmitAESInverseMixColumns()
We can just use the same register the data is in as the result register,
eliminating the need to use a completely separate register to store the
result.
2020-04-22 20:46:21 +01:00
Lioncash
e5d80e998e A64: Implement SADDLV 2020-04-22 20:46:21 +01:00
Lioncash
a1bc8ddb53 A64: Implement UADDLV 2020-04-22 20:46:21 +01:00
Lioncash
1dc1e3dcd8 fp: Use forward declarations where applicable
Minimizes the amount of files that need to be rebuilt if the headers
ever change.
2020-04-22 20:46:21 +01:00
Lioncash
46cb0d813b emit_x64_vector: Append 'v' prefix onto movq in AVX path
This is something I missed when adding in the AVX broadcast code.
2020-04-22 20:46:21 +01:00
Subv
4606a081c9 A64: The A64SetTPIDR IR instruction writes to a system register and should not be eliminated by the dead code elimination pass.
Previously this instruction was alway eliminated, resulting in incorrect values for TPIDR_EL0.
2020-04-22 20:46:21 +01:00
MerryMage
b53127600b fp: A64::FPCR -> FP::FPCR 2020-04-22 20:46:21 +01:00
MerryMage
084bf63a10 bit_util: Implement ClearBits and ModifyBits 2020-04-22 20:46:21 +01:00
MerryMage
699c5f36d5 system: Simplify static_cast 2020-04-22 20:46:21 +01:00
MerryMage
3f602129f4 system: Ensure value of CNTPCT_EL0 is accurate
Since we currently only update the host's tick count at the end of a
block, we force an end-of-block before executing a MRS %, CNTPCT_ELO
instruction.
2020-04-22 20:46:21 +01:00
Lioncash
84affdb260 safe_ops: Avoid cases where shift bases are invalid with signed values
For example, say the converted signed type is s64, shifting left  by 63
bits would be undefined behavior.

However, given an ASL is essentially the same behavior as an LSL
we can just use an unsigned type instead of converting to a signed type.
2020-04-22 20:46:21 +01:00
Lioncash
d0274f412a safe_ops: Avoid signed overflow in Negate()
Negation of values such as -9223372036854775808 can't be represented in
signed equivalents (such as long long), leading to signed overflow.
Therefore, we can just invert bits and add 1 to perform this behavior
with unsigned arithmetic.
2020-04-22 20:46:21 +01:00
Lioncash
af3e23b224 simd_scalar_shift_by_immediate: Implement FCVT{ZS, ZU} (vector, fixed-point)'s scalar double/single-precision variant 2020-04-22 20:46:21 +01:00
Lioncash
91abf87169 simd_scalar_two_register_misc: Implement FCVT{AS, AU, MS, MU, NS, NU, PS, PU, ZS, ZU} (vector)'s scalar double/single-precision variants
We can simply implement this in terms of the fixed-point IR opcodes.
2020-04-22 20:46:21 +01:00
Lioncash
0ec8dac660 emit_x64: Remove FPSCR_RoundTowardsZero() virtual function from EmitContext struct
This code was bugged in that we were comparing if the rounding mode was
not equal to rounding towards zero. Fortunately, however, nothing uses
this function anymore, and there's already the more general
FPSCR_RMode() available, so this can be removed entirely.
2020-04-22 20:46:21 +01:00
Lioncash
fd92e2f186 emit_x64: Add missing <array> include
Commit 755adef62e504a8d616de9dda8937d2428a9471b introduced a helper
alias for std::array, eliminating the need to manually type out sizes
for them, however I forgot to add the include for <array>
2020-04-22 20:46:21 +01:00
Lioncash
f939bd0228 emit_x64_vector{_floating_point}: Add helper alias for sizing arrays relative to vector width
Avoids needing to remember to specify the proper size of the arrays, all
that's needed is to specify the type of the array and the size will
automatically be deduced from it. This helps prevent potential oversized
or undersized arrays from being specified.
2020-04-22 20:46:21 +01:00
MerryMage
58f3399032 A64/PopRSBHint: Prevent RETing to a guest PC of ~0ull from crashing the jit 2020-04-22 20:46:21 +01:00
MerryMage
e18fca17dc A64: Implement FABD in terms of existing IR instructions
Fixes NaN issue. Closes #306.
2020-04-22 20:46:21 +01:00
MerryMage
1dbe9d95e6 FPRoundInt: Final FPRound based on new sign
While this shouldn't change any of the results in theory, it's just logically more consistent
2020-04-22 20:46:21 +01:00
MerryMage
83be491875 emit_x64_floating_point: SSE4.1 implementation of EmitFPRound 2020-04-22 20:46:20 +01:00
MerryMage
a40127a054 A64: Implement FRINTX, FRINTI (scalar) 2020-04-22 20:46:20 +01:00
MerryMage
962fa3b65e A64: Implement FRINTP, FRINTM, FRINTZ (scalar) 2020-04-22 20:46:20 +01:00
MerryMage
5200bf41cf A64: Implement FRINTN (scalar) 2020-04-22 20:46:20 +01:00
MerryMage
8718dc1692 A64: Implement FRINTA (scalar) 2020-04-22 20:46:20 +01:00
MerryMage
b228694012 IR: Implement FPRoundInt 2020-04-22 20:46:20 +01:00
MerryMage
e24054f4d7 fp: Implement FPRoundInt 2020-04-22 20:46:20 +01:00
MerryMage
f876e4afa2 fp: Implement FPProcessNaN 2020-04-22 20:46:20 +01:00
MerryMage
591adee443 fp/info: Add DefaultNaN 2020-04-22 20:46:20 +01:00
MerryMage
797e18cd97 fp: Move FPToFixed to its own file 2020-04-22 20:46:20 +01:00
MerryMage
295deb4035 a64_jit_state: Add FPSR.QC flag 2020-04-22 20:46:20 +01:00
Lioncash
7797bc2fb2 emit_x64_vector: Use non-scratch Use* variants of registers within EmitVectorUnsignedAbsoluteDifference()
In some cases, a register isn't modified, depending on the branch taken,
so we can signify this by using the non-scratch variants in certain
cases.
2020-04-22 20:46:20 +01:00
Lioncash
f7f83b76b7 simd_scalar_two_register_misc: Implement scalar double/single-precision variants of FCM{EQ, GE, GT, LE, LT} (zero) 2020-04-22 20:46:20 +01:00
Lioncash
9db6d1e98b translate_arm: Remove unnecessary rotr() function
We already have RotateRight() in our common code, so we can remove this
function and replace it with it. We can also implement ArmExpandImm_C()
in terms of ArmExpandImm().
2020-04-22 20:46:20 +01:00
Lioncash
9f8a44c982 cast_util: Remove unnecessary typename
Given we use std::aligned_storage_t, we don't need to specify
typename here. If we used std::aligned_storage, then we would need to.
2020-04-22 20:46:19 +01:00
MerryMage
89e43867c1 A64: Implement FADDP (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
33fa65de23 A64: Implement FADDP (vector) 2020-04-22 20:46:19 +01:00
MerryMage
9dba273a8c A64: Implement SADDLP 2020-04-22 20:46:19 +01:00
MerryMage
70ff2d73b5 A64: Implement UADDLP 2020-04-22 20:46:19 +01:00
MerryMage
5563bbbd79 A64: Implement EXT 2020-04-22 20:46:19 +01:00
MerryMage
304cc7f61e emit_x64_floating_point: SSE4.1 implementation for FP{Double,Single}ToFixed{S,U}{32,64} 2020-04-22 20:46:19 +01:00
MerryMage
3d9677d094 A64: Implement FCVTMU (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
79c9018d60 A64: Implement FCVTMS (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
49c4499a87 A64: Implement FCVTPU (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
af661ef5a6 A64: Implement FCVTPS (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
27319822bb A64: Implement FCVTAU (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
c0c7a26314 A64: Implement FCVTAS (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
a1965a74a0 A64: Implement FCVTNU (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
7d36dbcdfd A64: Implement FCVTNS (scalar) 2020-04-22 20:46:19 +01:00
MerryMage
617ca0adf0 floating_point_conversion_integer: Refactor implementation of FCVTZS_float_int and FCVTZU_float_int 2020-04-22 20:46:19 +01:00
MerryMage
caaf36dfd6 IR: Initial implementation of FP{Double,Single}ToFixed{S,U}{32,64}
This implementation just falls-back to the software floating point implementation.
2020-04-22 20:46:19 +01:00
MerryMage
760cc3ca89 EmitContext: Expose FPCR 2020-04-22 20:46:19 +01:00
MerryMage
9571269552 fp/op: Implement FPToFixed 2020-04-22 20:46:19 +01:00
MerryMage
8087e8df05 mantissa_util: Implement ResidualErrorOnRightShift
Accurately calculate residual error that is shifted out
2020-04-22 20:46:19 +01:00
MerryMage
8668d61881 fp/unpacked: Implement FPRound 2020-04-22 20:46:19 +01:00
MerryMage
55d590c01f FPCR: Add AHP setter and FZ16 getter 2020-04-22 20:46:19 +01:00
MerryMage
7360a2579b mp: Implement metaprogramming library 2020-04-22 20:46:19 +01:00
MerryMage
4ab029c114 fp: Implement FPUnpack 2020-04-22 20:46:19 +01:00
MerryMage
4875658917 fp: Implement FPProcessException 2020-04-22 20:46:19 +01:00
MerryMage
3cb98e1560 fp: Move fp_util to fp/util 2020-04-22 20:46:19 +01:00
MerryMage
c41a38b13e fp: Add FPSR 2020-04-22 20:46:19 +01:00
MerryMage
66381352f3 fp: Add FPInfo
Provides information about floating-point format for various bit sizes
2020-04-22 20:46:19 +01:00
MerryMage
d21659152c safe_ops: Implement safe shifting operations
Implement shifiting operations that perform consistently across architectures
without running into undefined or implemented-defined behaviour.
2020-04-22 20:46:19 +01:00
MerryMage
b00fe23b91 bit_util: Implement MostSignificantBit 2020-04-22 20:46:19 +01:00
MerryMage
95ad0d0a66 bit_util: Use Ones to implement Bits 2020-04-22 20:46:19 +01:00
MerryMage
62b640b2fa bit_util: Add ClearBit and ModifyBit 2020-04-22 20:46:19 +01:00
MerryMage
8651c2d10e u128: Implement u128
For when we need a 128-bit integer
2020-04-22 20:46:19 +01:00
Lioncash
e7409fdfe4 A64: Implement UCVTF (vector, integer)'s double/single-precision variant 2020-04-22 20:46:19 +01:00
Lioncash
4aa4885ba7 ir: Add opcodes for vector conversion of u32/u64 to floating-point 2020-04-22 20:46:19 +01:00
Lioncash
fcae4e2418 simd_three_different: Deduplicate common implementations
Generally, the only difference between the signed variants and the
unsigned variants is whether or not we use a sign-extension or
zero-extension, so we can simply use common functions to implement both
cases without totally duplicating code twice here.
2020-04-22 20:46:19 +01:00
Lioncash
9c0d5cf15c floating_point_conversion_integer: Handle S64/U64 -> F32 conversions in SCVTF_float_int and UCVTF_float_int 2020-04-22 20:46:19 +01:00
Lioncash
7a84b6e8d8 ir: Add opcodes for converting S64 and U64 to single-precision floating-point values 2020-04-22 20:46:19 +01:00
Lioncash
066061fa50 constant_pool: Remove unnecessary std::memset from constructor
AllocateFromCodeSpace() already zeroes out the allocated memory.
2020-04-22 20:46:19 +01:00
Lioncash
a1d6a86e8c A64: Implement ADDV 2020-04-22 20:46:19 +01:00
Lioncash
35026a6ce3 emit_x64_vector: Vectorize fallback path for EmitVectorMaxU32() 2020-04-22 20:46:19 +01:00
Lioncash
245c903129 simd_three_same: Join FPAbsoluteComparison() into FPCompareRegister()
These are part of the same comparison family, so there's no real point
in keeping them separate.
2020-04-22 20:46:19 +01:00
Lioncash
9912836b59 A64: Implement scalar double/single-precision variants of FACGE, FACGT, FCMEQ, FCMGE, FCMGT 2020-04-22 20:46:18 +01:00
MerryMage
0b97e9bd8d emit_x64_floating_point: Fix EmitFPU64ToDouble for TowardsMinusInfinity rounding mode 2020-04-22 20:46:18 +01:00
MerryMage
a2eb9a02e0 backend_x86: Add FPSCR_RMode to EmitContext 2020-04-22 20:46:18 +01:00
MerryMage
d875c08ebf fp: Extract common RoundingMode enum 2020-04-22 20:46:18 +01:00
Lioncash
3714bc0ed4 floating_point_conversion_integer: Use FPS64ToDouble and FPU64ToDouble in SCVTF_float_int and UCVTF_float_int
The opcodes introduced in 979b6f39f1621b80bd463645ec5b08661cb6b1bf can
also be used here, avoiding more falling back to the interpreter.
2020-04-22 20:46:18 +01:00
Lioncash
b97358075e simd_scalar_two_register_misc: Handle 64-bit case in SCVTF and UCVTF's scalar double/single-precision variant
Avoids falling back to the interpreter in the 64-bit case.
2020-04-22 20:46:18 +01:00
Lioncash
7252293184 emit_x64_floating_point: Correct use of UseGpr() in EmitFPU32ToDouble() and EmitFPU32ToSingle()
In the non-AVX512 path, the following code is present:

code.mov(from.cvt32(), from.cvt32());

since this potentially modifies 'from', we should be using
UseScratchGpr() instead.
2020-04-22 20:46:18 +01:00
Lioncash
fbd7623fe5 emit_x64_floating_point: Add AVX512F conversion operations to EmitFPU32ToSingle() and EmitFPU32ToDouble()
AVX-512F provides convenient instructions for these kinds of conversions
directly
2020-04-22 20:46:18 +01:00
Lioncash
3a41465eaf ir: Add opcodes for converting S64 and U64 to double-precision values 2020-04-22 20:46:18 +01:00
MerryMage
436ca80bcd Merge branch 'global_monitor' 2020-04-22 20:46:18 +01:00
Lioncash
0f4bf26e05 simd_two_register_misc: Utilize FPVectorAbs in FABS implementations
Since we already have opcodes introduced to implement FACGE and FACGT,
we can reutilize it for the FABS implementations.
2020-04-22 20:46:18 +01:00
MerryMage
821cff1227 A64: Add ClearExclusiveState method 2020-04-22 20:46:18 +01:00
Lioncash
81e572c78c ir: Extend FPVectorAbs opcode to also handle 16-bit elements for FP16 2020-04-22 20:46:18 +01:00
MerryMage
2a8de5f733 a64_emit_x64: Clear exclusive state in EmitA64CallSupervisor
The kernel would have to execute an ERET instruction to return to
userland; this clears exclusive state.
2020-04-22 20:46:18 +01:00
Lioncash
53dbb6a92a A64: Implement FACGE's vector single/double precision variants 2020-04-22 20:46:18 +01:00
MerryMage
57f7c7e1b0 Implement global exclusive monitor 2020-04-22 20:46:18 +01:00
Lioncash
6912a02d9b A64: Implement FACGT's vector single/double precision variants 2020-04-22 20:46:18 +01:00
MerryMage
85234338d3 a64_emit_x64: Simplify EmitExclusiveWrite 2020-04-22 20:46:18 +01:00
Lioncash
fc731dddae ir: Add opcodes for performing vector absolute floating-point values
This will be usable for implementing FACGE and FACGT
2020-04-22 20:46:18 +01:00
MerryMage
2fc6b33829 CMakeLists: Add missing files 2020-04-22 20:46:18 +01:00
Lioncash
0bee648b4f emit_x64_vector: Deduplicate a bit of code in EmitVectorSetElement{8, 32, 64} functions
Given both branches are the same, we can hoist out the common code.
2020-04-22 20:46:18 +01:00
Lioncash
d86fea0d28 A64: Implement FCMEQ (zero)'s vector single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
593eca7fb1 A64: Implement load/store single structure instructions
Implements LD{1, 2, 3, 4}, LD{1, 2, 3, 4}R, and ST{1, 2, 3, 4} single
structure variants.
2020-04-22 20:46:18 +01:00
Lioncash
9bec354791 A64: Implement FCMEQ (register)'s vector single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
b6e223fc58 emit_x64_vector: Deduplicate a bit of code within EmitVectorGetElement8()
Given both branches use the same destination register size, we can hoist
the common code out.
2020-04-22 20:46:18 +01:00
Lioncash
5ce187a54e ir: Add opcodes for floating-point vector equalities 2020-04-22 20:46:18 +01:00
MerryMage
be354dbfd0 ir/basic_block: Add missing U16 immediate type to DumpBlock 2020-04-22 20:46:18 +01:00
Lioncash
cf188448d4 emit_x64_vector: Vectorize fallback case in EmitVectorMultiply64()
Gets rid of the need to perform a fallback.
2020-04-22 20:46:18 +01:00
MerryMage
5503ff28c3 llvm_disassemble: Allow disassembly of invalid AArch64 instructions 2020-04-22 20:46:18 +01:00
Lioncash
954deff2d4 emit_x64_vector: Add break to final case in EmitVectorRoundingHalvingAddUnsigned()
This doesn't alter behavior but does make the code better if anything
else is ever added to this function in the future.
2020-04-22 20:46:18 +01:00
Lioncash
11a92eaaef A64: Implement SRHADD and URHADD 2020-04-22 20:46:18 +01:00
Lioncash
9e75d08860 A64: Implement FABD's scalar single/double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
bc718c5b28 ir: Add opcodes for performing rounding halving adds 2020-04-22 20:46:18 +01:00
Lioncash
d898d1779d A64: Implement FABD's vector single/double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
054549da35 emit_x64_vector: Simplify AVX-512 codepath in EmitVectorMultiply64
I realized I introduced a helper for simple AVX operation emitting, so
use that instead of writing it all out long-form.
2020-04-22 20:46:18 +01:00
Lioncash
8a4f8aed06 ir: Add opcode for performing FP vector absolute differences 2020-04-22 20:46:18 +01:00
Lioncash
cb456f914b A64: Implement UMLAL{2}, UMLSL{2}, and UMULL{2}
Now that we have the helper function set up for the signed variants, we
can also modify it to be used with the unigned ones by performing a zero
extension instead of a sign extension.
2020-04-22 20:46:18 +01:00
MerryMage
ba84e7a8de A64: Implement FNMSUB 2020-04-22 20:46:18 +01:00
Lioncash
3576c02d91 A64: Implement SMLSL{2} 2020-04-22 20:46:18 +01:00
MerryMage
a1042cfcd8 A64: Implement FNMADD 2020-04-22 20:46:18 +01:00
Lioncash
ada5c0b2fa A64: Implement SMLAL{2} 2020-04-22 20:46:18 +01:00
MerryMage
0d83032a6f A64: Implement FMSUB 2020-04-22 20:46:18 +01:00
Lioncash
2d1aca25e6 A64: Implement SMULL{2} 2020-04-22 20:46:18 +01:00
MerryMage
69e00d225c A64: Implement FMADD 2020-04-22 20:46:18 +01:00
MerryMage
8c90fcf58e IR: Implement FPMulAdd 2020-04-22 20:46:18 +01:00
Lioncash
c5ae9107a9 A64: Implement SABAL/SABAL2 and SABDL/SABDL2
Now that we have a helper function for the unsigned variants, we can
modify it to also be usable with the signed variants.
2020-04-22 20:46:18 +01:00
Lioncash
24e3299276 A64: Implement FCMGT, FCMGE (register) vector double and single precision variants 2020-04-22 20:46:18 +01:00
Lioncash
26d4473851 A64: Implement UABAL/UABAL2 2020-04-22 20:46:18 +01:00
Lioncash
350bc70be8 A64: Implement FCMGT, FCMGE, FCMLE, FCMLT (zero) vector double and single precision variants. 2020-04-22 20:46:18 +01:00
Lioncash
3397742c74 A64: Implement UABDL/UABDL2 2020-04-22 20:46:18 +01:00
Lioncash
c695da1cf3 ir: Add opcode for floating-point GE and GT comparisons
The rest of the comparisons can be implemented in terms of these two
2020-04-22 20:46:18 +01:00
Lioncash
6de5ed96e5 emit_x64_vector: Emit VPMULLQ in EmitVectorMultiply64 on AVX-512{DQ, VL} capable CPUs
Shortens code-gen down to a single instruction in the 64-bit path.
2020-04-22 20:46:18 +01:00
Lioncash
9054d1c20b A64: Implement LDR (literal, SIMD&FP) 2020-04-22 20:46:18 +01:00
Lioncash
0da5e949a8 Correct typo in DataCacheOperation enum
Fixes a typo for the InvalidateByVAToPoC enum entry. Given yuzu is the
only known user of 64-bit mode and it doesn't use this value, we can get
away with changing this.
2020-04-22 20:46:18 +01:00
Lioncash
9736e2cce2 A64: Implement FABS' half-precision variant 2020-04-22 20:46:18 +01:00
Lioncash
6e5750e4ec A64: Implement FABS' single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
7bce8d8757 A64: Implement URSHR (scalar) and URSRA (scalar)
Now that the utility function is all set up from implementing SRSRA, the
unsigned variants can now be trivially implemented by modifying the
utility function to perform a logical shift right instead of an
arithmetical shift right for the unsigned case.
2020-04-22 20:46:18 +01:00
Lioncash
1e70a589b0 A64: Implement SRSRA (scalar) 2020-04-22 20:46:18 +01:00
Lioncash
998aef07f6 A64: Implement SRSHR (scalar) 2020-04-22 20:46:17 +01:00
Lioncash
7c0250e9f8 A64: Implement SABA 2020-04-22 20:46:17 +01:00
Lioncash
f00789e6f7 A64: Implement SABD 2020-04-22 20:46:17 +01:00
Lioncash
1e10017f4b ir: Add opcodes for signed absolute differences 2020-04-22 20:46:17 +01:00
Tillmann Karras
d3b44c1b5a decoder_detail: use structured bindings 2020-04-22 20:46:17 +01:00
Lioncash
f745eb28bf simd_two_register_misc: Handle 64-bit case for SCVTF_int_4 2020-04-22 20:46:17 +01:00
Lioncash
3f6c529da2 ir: Add opcode to perform the vector conversion S64->F64
Unfortunately x86 prior to AVX-512 doesn't really give us any convenient instruction to do the work for us
2020-04-22 20:46:17 +01:00
Lioncash
0e61ee6bf6 A64: Implement SHLL/SHLL2 2020-04-22 20:46:17 +01:00
Lioncash
43e6e98c3b A64: Add missing decoding for PRFM (unscaled offset) 2020-04-22 20:46:17 +01:00
Lioncash
f2a85d5601 A64: Implement UHSUB 2020-04-22 20:46:17 +01:00
Lioncash
b33360a324 A64: Implement SHSUB 2020-04-22 20:46:17 +01:00
Lioncash
44a5f8095a ir: Add opcodes for performing vector halving subtracts 2020-04-22 20:46:17 +01:00
Lioncash
4f37c0ec5a A64: Implement SM4EKEY 2020-04-22 20:46:17 +01:00
Lioncash
3bde3347a5 A64: Implement SM4E 2020-04-22 20:46:17 +01:00
Lioncash
b312d28295 ir: Add an opcode for doing an SM4 lookup table query 2020-04-22 20:46:17 +01:00
Lioncash
27a6d5f6ce emit_x64_vector: Use VPOPCNTB in EmitVectorPopulationCount() if AVX-512 BITALG is available 2020-04-22 20:46:17 +01:00
Lioncash
4dcc7724e0 A64: Implement UHADD 2020-04-22 20:46:17 +01:00
Lioncash
f8714f7250 A64: Implement SHADD 2020-04-22 20:46:17 +01:00
Lioncash
089096948a ir: Add opcodes for performing halving adds 2020-04-22 20:46:17 +01:00
Lioncash
3d00dd63b4 emit_x64_vector: Emit VPMINSQ and VPMINUQ for 64-bit vector min operations if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
b97b71b8aa emit_x64_vector: Emit VPMAXSQ and VPMAXUQ for 64-bit vector max operations if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
033e400df0 emit_x64_vector_floating_point: Deduplicate accurate NaN handling code
Allows the code to both be used from the 32 bit and 64 bit operations without duplicating code.
2020-04-22 20:46:17 +01:00
Lioncash
0f067b7330 emit_x64_vector: Emit VPABSQ in EmitVectorAbs() for the 64-bit case if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
d4ee878cbd emit_x64_vector: Use VPSRAQ in EmitVectorArithmeticShiftRight64() if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
b38dd191bd disassembler_arm: Remove rotation helper function in favor of Common::RotateRight
Mildly reduces the amount of duplicated behavior
2020-04-22 20:46:17 +01:00
Lioncash
51e4f1d9db emit_x64_vector: Vectorize fallback path of EmitVectorMaxS32() 2020-04-22 20:46:17 +01:00
Lioncash
c692ccdd6d emit_x64_vector: Vectorize fallback path of EmitVectorMaxS8() 2020-04-22 20:46:17 +01:00
Lioncash
b194313d8c emit_x64_vector: Vectorize fallback path in EmitVectorMinU32() 2020-04-22 20:46:17 +01:00
Lioncash
7ceda6d919 emit_x64_vector: Vectorize fallback path in EmitVectorMinU16() 2020-04-22 20:46:17 +01:00
Lioncash
cda85a1da0 emit_x64_vector: Vectorize fallback path in EmitVectorMinS32() 2020-04-22 20:46:17 +01:00
Lioncash
6e08eed210 emit_x64_vector: Vectorize fallback path in EmitVectorMinS8() 2020-04-22 20:46:17 +01:00
Lioncash
0fb6dce689 emit_x64_vector: Remove unnecessary if constexpr expression in LogicalVShift
This can simply be merged with the previous one.
2020-04-22 20:46:17 +01:00
Lioncash
5b71b1337b emit_x64_vector: Avoid left shift of negative value in LogicalVShift
Now that we handle the signed variants, we also have to be careful about left shifts with negative values,
as this is considered undefined behavior.
2020-04-22 20:46:17 +01:00
Lioncash
9954d28868 a64_jitstate: Zero SP and PC on construction of A64JitState
Given we zero out/reset everything else in the struct, do the same for these members to keep initialization consistent
2020-04-22 20:46:17 +01:00
Lioncash
4efbd40ea4 backend_x64/callback: Default virtual destructor in the cpp file
Prevents the vtable being generated in each translation unit that includes the header (and silences -Wweak-vtables warnings)
2020-04-22 20:46:17 +01:00
Lioncash
edd0b5c8c7 a32_interface/a64_interface: Change reinterpret_casts to static_casts in GetCurrentBlock thunks
It's well-defined to static_cast a void* to its proper type.
2020-04-22 20:46:17 +01:00
Lioncash
e71612d394 A64: Implement SSHL (scalar) 2020-04-22 20:46:17 +01:00
Lioncash
ef1e69a1e3 A64: Implement SSHL (vector) 2020-04-22 20:46:17 +01:00
Lioncash
21974ee57e backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants
Also adds IR opcodes to dispatch said variants
2020-04-22 20:46:17 +01:00
Lioncash
9fc89f0a0e emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size
Similar changes were done in emit_x64_vector, but these were missed.
2020-04-22 20:46:17 +01:00
Lioncash
af28e89a13 emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16() 2020-04-22 20:46:17 +01:00
Lioncash
cda75e2079 A64: Implement CMTST's scalar variant 2020-04-22 20:46:17 +01:00
Lioncash
0d20423ad5 emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32() 2020-04-22 20:46:17 +01:00
Lioncash
d70ee7c0d1 emit_x64_vector: Use VBPROADCAST where applicable and available
Uses the instruction that does what it says in its name if available. Allows avoiding the use
of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.
2020-04-22 20:46:17 +01:00
Lioncash
bebe7235ae A64: Implement UZP1 and UZP2 2020-04-22 20:46:17 +01:00
Lioncash
26d77c6f09 ir: Add opcodes for performing vector deinterleaving 2020-04-22 20:46:17 +01:00
Lioncash
d6f9ed47d9 A64: Implement FNEG (half-precision) 2020-04-22 20:46:17 +01:00
Lioncash
7efbd73bac A64: Implement USHL (scalar) 2020-04-22 20:46:17 +01:00
Lioncash
41f4717f2b A64: Implement FNEG (vector) 2020-04-22 20:46:17 +01:00
Lioncash
ba1cc6366d A64: Implement RSUBHN/RSUBHN2 2020-04-22 20:46:17 +01:00
Lioncash
e41640fe33 A64: Implement RADDHN/RADDHN2 2020-04-22 20:46:17 +01:00
Lioncash
b719a6b3f7 A64: Implement XAR 2020-04-22 20:46:17 +01:00
Lioncash
0b1b131ec2 simd_two_register_misc: Factor out common comparison code
Gets rid of a tiny bit of duplicated code.
2020-04-22 20:46:17 +01:00
Lioncash
ed0b84da70 A64: Implement CMLE (zero)'s vector variant 2020-04-22 20:46:17 +01:00
Lioncash
b595a68ffa A64: Implement CMTST (vector) 2020-04-22 20:46:17 +01:00
Lioncash
48c7f8630c A64: Implement ADDHN{2} and SUBHN{2} 2020-04-22 20:46:17 +01:00
Lioncash
3acd9c9200 translate: zero extend result in Vpart when storing to lower part of vector 2020-04-22 20:46:17 +01:00
Lioncash
87ca63699f emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs 2020-04-22 20:46:17 +01:00
Lioncash
f17702f608 emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs 2020-04-22 20:46:17 +01:00
Lioncash
596a8dd1dd emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs
Provides a better alternative to a fallback operation.
2020-04-22 20:46:17 +01:00
Lioncash
75fd4eaaaa emit_x64_vector: Get rid of some magic numbers in loop bounds 2020-04-22 20:46:17 +01:00
Lioncash
7b80ac25eb emit_x64_vector: Generify variable shift functions 2020-04-22 20:46:17 +01:00
Lioncash
4ec735f707 A64: Implement CMLE (zero)'s scalar variant 2020-04-22 20:46:17 +01:00
Lioncash
6534184df2 A64: Implement CMLT (zero)'s scalar single/double-precision variant 2020-04-22 20:46:17 +01:00
Lioncash
8863c9bb4b A64: Implement SHA512H2 2020-04-22 20:46:17 +01:00
Lioncash
033b890e25 A64: Implement SHA512H 2020-04-22 20:46:17 +01:00
Lioncash
d1f5b084b4 A64: Handle S32->F32 case for SCVTF (vector) 2020-04-22 20:46:17 +01:00
Lioncash
38fa984b53 IR: Add opcode for packed word->f32 conversions 2020-04-22 20:46:16 +01:00
Lioncash
b8587d8e34 A64: Implement SHA512SU1 2020-04-22 20:46:16 +01:00
Lioncash
44d846045a A64: Implement SHA512SU0 2020-04-22 20:46:16 +01:00
Lioncash
ca903c1585 A64: Implement SHA256H and SHA256H2 2020-04-22 20:46:16 +01:00
MerryMage
e4237c44eb A64: Implement SCVTF (vector, integer), scalar varaint 2020-04-22 20:46:16 +01:00
MerryMage
bfba38d0b6 impl: Reorganize scalar two-register misc instructions 2020-04-22 20:46:16 +01:00
Lioncash
ea582b17cc A64: Implement SHA256SU1 2020-04-22 20:46:16 +01:00
Lioncash
06c5dcaf5e simd_two_register_misc: Add missing zeroing of the vector for CMGT and CMLT 2020-04-22 20:46:16 +01:00
Lioncash
0d50d7314b A64: Implement CMGE (zero)'s vector variant 2020-04-22 20:46:16 +01:00
Lioncash
ab35dc0e78 A64: Implement MLS (by element) 2020-04-22 20:46:16 +01:00
Lioncash
1651e60462 A64: Implement MUL (by element) 2020-04-22 20:46:16 +01:00
MerryMage
a86d4093cd A64: Implement MLA (by element) 2020-04-22 20:46:16 +01:00
Lioncash
7f47402609 A64: Implement ABS (scalar) 2020-04-22 20:46:16 +01:00
Lioncash
c8eb4528be A64: Implement SHA256SU0 2020-04-22 20:46:16 +01:00
Lioncash
181c3b0790 A64: Implement SHA1M 2020-04-22 20:46:16 +01:00
Lioncash
47bc97a71b A64: Implement SHA1P 2020-04-22 20:46:16 +01:00
Lioncash
718f3e9bb4 A64: Implement scalar variants of CMEQ, CMGT, and CMGE zero comparison instructions
These can trivially use the ScalarCompare helper function.
2020-04-22 20:46:16 +01:00
Lioncash
3ad4e547e4 A64: Implement scalar variant of NEG 2020-04-22 20:46:16 +01:00
Lioncash
b4f3051e4b simd: Relocate REV16, REV32 and REV64 vector variants to the proper file
These aren't scalar instruction variants.
2020-04-22 20:46:16 +01:00
Lioncash
19e276d10f A64: Implement CMEQ (register, scalar) 2020-04-22 20:46:16 +01:00
Lioncash
5b8c9e5146 A64: Implement CMHS (register, scalar) 2020-04-22 20:46:16 +01:00
Lioncash
78bb12276a A64: Implement CMHI (register, scalar) 2020-04-22 20:46:16 +01:00
Lioncash
c18b20b8d1 A64: Implement CMGE (register, scalar) 2020-04-22 20:46:16 +01:00
Lioncash
755981d0da A64: Implement CMGT (register, scalar) 2020-04-22 20:46:16 +01:00
Lioncash
da6627124b A64: Implement SHA1C 2020-04-22 20:46:16 +01:00
Lioncash
3c013bd9f8 A64: Implement SLI (scalar) 2020-04-22 20:46:16 +01:00
Lioncash
154cac594a A64: Implement SRI (scalar) 2020-04-22 20:46:16 +01:00
Lioncash
6bcfdba1ad general: Remove unused lambda captures
Resolves warnings that occur in Xcode 9.3
2020-04-22 20:46:16 +01:00
Lioncash
205ca6b4cb A64: Implement SHA1SU1 2020-04-22 20:46:16 +01:00
Lioncash
16a001b9ff A64: Implement SHA1SU0 2020-04-22 20:46:16 +01:00
Lioncash
3b6db59850 A64: Implement TRN2 2020-04-22 20:46:16 +01:00
Lioncash
30e158f8d0 A64: Implement TRN1 2020-04-22 20:46:16 +01:00
Lioncash
52cad2d9d0 A64: Implement SSRA (scalar) 2020-04-22 20:46:16 +01:00
Lioncash
255a33936d A64: Implement SSHR (scalar) 2020-04-22 20:46:16 +01:00
Lioncash
6723b00497 A64: Implement USRA (scalar) 2020-04-22 20:46:16 +01:00
Lioncash
d56fa8f735 A64: Implement USHR (scalar) 2020-04-22 20:46:16 +01:00
Lioncash
870e418b0b A64: Implement SHL (scalar) 2020-04-22 20:46:16 +01:00
Lioncash
97f2bea4f2 A64: Implement SM3PARTW1 2020-04-22 20:46:16 +01:00
Lioncash
e268b110f0 simd_sha512: Simplify RAX1
Now that the vector rotation helpers are in, replace the explicit
shifting with the relevant helper function that does the same thing.

Simply tidies up code; no behavioral changes are made.
2020-04-22 20:46:16 +01:00
Lioncash
20d2491267 A64: Implement SM3PARTW2 2020-04-22 20:46:16 +01:00
Lioncash
e1b662e90c ir: Add helper functions for vector rotation 2020-04-22 20:46:16 +01:00
Lioncash
8a60a63a8b A64: Implement SM3TT2B 2020-04-22 20:46:16 +01:00
Lioncash
b3d4c02098 A64: Implement SM3TT2A 2020-04-22 20:46:16 +01:00
Lioncash
7fbccabd81 A64: Implement SM3TT1B 2020-04-22 20:46:16 +01:00
Lioncash
769373b3ed A64: Implement SM3TT1A 2020-04-22 20:46:16 +01:00
Lioncash
2d269fdcc7 simd_shift_by_immediate: Merge signed/unsigned helper functions
Gets rid of a little more code duplication.
2020-04-22 20:46:16 +01:00
Lioncash
d5461be6b4 A64: Implement SM3SS1 2020-04-22 20:46:16 +01:00
Lioncash
2db032ac83 A64: Implement SRI (vector) 2020-04-22 20:46:16 +01:00
Lioncash
11005cfe26 A64: Implement SLI (vector) 2020-04-22 20:46:16 +01:00
Lioncash
e3d9bf55e7 A64: Implement SRSRA (vector) 2020-04-22 20:46:16 +01:00
Lioncash
bc6016cad7 A64: Implement SRSHR (vector) 2020-04-22 20:46:16 +01:00
MerryMage
6c9c829a08 imm: Add additional bit position checks to Imm::Bits 2020-04-22 20:46:16 +01:00
MerryMage
be907a61f7 math_util: rvalue references for std::forward 2020-04-22 20:46:16 +01:00
Lioncash
a2f8cdf0a3 A64: Implement SSUBL/SSUBL2 2020-04-22 20:46:16 +01:00
Lioncash
d456fb85c8 A64: Implement SADDL/SADDL2 2020-04-22 20:46:16 +01:00
Lioncash
5c9e7f328d A64: Implement USUBL/USUBL2 2020-04-22 20:46:16 +01:00
Lioncash
88d70e3b8a A64: Implement UADDL/UADDL2 2020-04-22 20:46:16 +01:00
Lioncash
4b3d70de5f simd_shift_by_immediate: Factor out common code in shift instructions
Gets rid of partial duplication of the same code for instructions that only have a small behavior difference to them.

e.g. The only difference between SSHR and SSRA is that SSRA adds an accumulator before storing the result.
2020-04-22 20:46:16 +01:00
Lioncash
56803f5203 A64: Implement URSRA (vector) 2020-04-22 20:46:16 +01:00
Lioncash
8afdf4b23d A64: Implement URSHR (vector) 2020-04-22 20:46:16 +01:00
Lioncash
16613ee066 A64: Implement RSHRN/RSHRN2 2020-04-22 20:46:15 +01:00
Lioncash
937990fd2a A64: Implement SHRN/SHRN2 2020-04-22 20:46:15 +01:00
Lioncash
80e005e5b5 A64/translate: Amend I() to also handle u8 and u16 immediates
This is necessary for instructions like SRSHR, and other related instructions.
2020-04-22 20:46:15 +01:00
MerryMage
7969871aa3 A64: Implement FMOV (vector, immediate) and mark other SIMD modified immediate instructions as unallocated 2020-04-22 20:46:15 +01:00
MerryMage
5c95e28ed0 A64: Implement ZIP2 2020-04-22 20:46:15 +01:00
MerryMage
871aefb9a0 decoder/a64: Tweak ordering algorithm
Ensuring only instruction families are sorted with each other in
the fashion previously devised does not admit a total ordering.
2020-04-22 20:46:15 +01:00
MerryMage
575590d18d ir_emitter: Remove overloads
Having overloads made explicit casting necesssary for these functions when
using types like UAny.
2020-04-22 20:46:15 +01:00
Lioncash
83ff7a43d1 A64: Implement RBIT (vector) 2020-04-22 20:46:15 +01:00
Lioncash
64b1f2d468 ir: Add opcode for reversing bits in a vector 2020-04-22 20:46:15 +01:00
Lioncash
9de60b60bb A64/translate: Amend instruction prototypes erroneously marked as taking Reg
Makes the prototypes consistent
2020-04-22 20:46:15 +01:00
Lioncash
cf81f04ed3 A64: Implement RAX1 2020-04-22 20:46:15 +01:00
Lioncash
7371e63a7b a64_get_set_elimination_pass: Make TrackingType enum an enum class
Prevents placing single letter enum members into the surrounding scope.
2020-04-22 20:46:15 +01:00
Lioncash
7bcb1c115a A64: Implement ABS (vector) 2020-04-22 20:46:15 +01:00
Lioncash
e33dcce14a ir: Add opcodes for performing vector absolute values 2020-04-22 20:46:15 +01:00
Lioncash
84d49309b9 A64: Implement USUBW/USUBW2 2020-04-22 20:46:15 +01:00
Lioncash
e20fce6b5a A64: Implement SSUBW/SSUBW2 2020-04-22 20:46:15 +01:00
Lioncash
00af6eeab9 A64: Implement SADDW/SADDW2 2020-04-22 20:46:15 +01:00
MerryMage
78a047f0f9 A64: Implement EXT 2020-04-22 20:46:15 +01:00
MerryMage
3472f371df IR: Implement VectorExtract, VectorExtractLower IR instructions 2020-04-22 20:46:15 +01:00
MerryMage
8bba37089e A64: Implement UADDW 2020-04-22 20:46:15 +01:00
MerryMage
5c47f03888 A64: Implement FMUL (vector) 2020-04-22 20:46:15 +01:00
Lioncash
a6e264c2dd A64: Implement UABA
Now that we have unsigned absolute difference capabilities, we can just use this to
append onto the result via a vector add.
2020-04-22 20:46:15 +01:00
Lioncash
c2e7364d3e A64: Implement UABD 2020-04-22 20:46:15 +01:00
Lioncash
ad5cf584ce ir: Add opcodes for performing vector unsigned absolute differences 2020-04-22 20:46:15 +01:00
Lioncash
7780af56e3 ir_emitter: Make immediate member functions const qualified
These don't modify class state
2020-04-22 20:46:15 +01:00
Lioncash
701f43d61e IR: Add opcodes for interleaving upper-order bytes/halfwords/words/doublewords
I should have added this when I introduced the functions for interleaving
low-order equivalents for consistency in the interface.
2020-04-22 20:46:15 +01:00
Lioncash
94f0fba16b A64: Implement SHA1H
This is a fairly trivial instruction it's essentially:

result = ROL(data, 30);
2020-04-22 20:46:15 +01:00
Lioncash
3985f7bf84 emit_x64_data_processing: Deduplicate some code in zero-extension functions
EmitZeroExtendByteToLong() can be implemented in terms of EmitZeroExtendByteToWord() and
EmitZeroExtendHalfToLong() can be implemented in terms of EmitZeroExtendHalfToWord().
2020-04-22 20:46:15 +01:00
Lioncash
40ec25356b A64: NOP immediate variant of PRFM
Makes behavior identical to the literal variant of PRFM. Given this is simply a hint instruction,
this is valid behavior. The upside is that we don't fall back to Unicorn unnecessarily whenever
the instruction is encountered.
2020-04-22 20:46:15 +01:00
MerryMage
e7b60189b3 abi: Missing includes' 2020-04-22 20:46:15 +01:00
MerryMage
cdc5c3ad95 emit_x64_floating_point: Near jump instead of short jump in FPMinNumberic{32,64} 2020-04-22 20:46:15 +01:00
Lioncash
73b9e4b276 A64: system: Use an enum class for MRS/MSR register encodings
Reduces the need to manually write out the register bit encodings repeatedly.
2020-04-22 20:46:15 +01:00
MerryMage
df4ee0f51e emit_X64_floating_point: Near jmp to end instead of short jmp
Jump destination can be further than what can be reached in a short
jump under some FPCR options.
2020-04-22 20:46:15 +01:00
Lioncash
b8d5765f9b emit_x64_vector: Fix typo in VectorShuffleImpl
This is supposed to be pshufd, not pshufw (which only allows a 64-bit operand)
2020-04-22 20:46:15 +01:00
Lioncash
586b00d11d A64: Implement REV64 2020-04-22 20:46:15 +01:00
Lioncash
ade595e377 bit_util: Do nothing in RotateRight if the rotation amount is zero
Without this sanitizing it's possible to perform a shift with a shift
amount that's the same size as the type being shifted. This actually
occurs when decoding ORR variants.

We could get fancier here and make this branchless, but we don't
really use RotateRight in any performance intensive areas.
2020-04-22 20:46:15 +01:00
Lioncash
9128988dc3 A64: Implement REV32 (vector) 2020-04-22 20:46:15 +01:00
Lioncash
6b0010c940 ir: Add IR opcodes for emitting vector shuffles
This uses the ARM terminology for sizes (Halfword -> 2 bytes, Word -> 4 bytes)
as opposed to the x86 terminology of (Word -> 2 bytes, Double word -> 4 bytes)
2020-04-22 20:46:15 +01:00
Lioncash
eb2d28d2b1 emit_x64_vector_floating_point: Fix out of bounds array access in EmitVectorOperation64 2020-04-22 20:46:15 +01:00
Lioncash
6ad1bce5e0 A64: Implement REV16 (vector) 2020-04-22 20:46:15 +01:00
Lioncash
6177c2c63d CMakeLists: Add fp_util, macro_util and math_util headers
Allows the headers to show up within IDEs
2020-04-22 20:46:15 +01:00
Lioncash
7a66224d9a A64: Implement EOR3 and BCAX 2020-04-22 20:46:15 +01:00
MerryMage
be5047c7c2 impl: Update PC when raising exception 2020-04-22 20:46:15 +01:00
MerryMage
49cc6d7fad A64: Implement FDIV (vector) 2020-04-22 20:46:15 +01:00
MerryMage
fd075d8d68 system: Raise exception for YIELD, WFE, WFI, SEV, SEVL 2020-04-22 20:46:15 +01:00
MerryMage
c832cec96d Correct FPSR and FPCR 2020-04-22 20:46:15 +01:00
MerryMage
147284427b A64: Implement USHL 2020-04-22 20:46:15 +01:00
MerryMage
fd8f4c1195 A64: Implement UCVTF (vector, integer), scalar variant 2020-04-22 20:46:15 +01:00
MerryMage
be57608353 A64: Partially implement FCVTZU (scalar, fixed-point) and FCVTZS (scalar, fixed-point) 2020-04-22 20:46:15 +01:00
MerryMage
e4697b1676 A64: Implement system register TPIDR_EL0 2020-04-22 20:46:15 +01:00
MerryMage
e3da92024e A64: Implement system registers FPCR and FPSR 2020-04-22 20:46:15 +01:00
MerryMage
9e4e4e9c1d A64: Implement system register CNTPCT_EL0 2020-04-22 20:46:15 +01:00
MerryMage
1e15283d00 A64: Implement system register CTR_EL0 2020-04-22 20:46:15 +01:00
MerryMage
58fbb3ff1b A64: Implement NEG (vector) 2020-04-22 20:46:15 +01:00
MerryMage
710d09471b IR: Add IR instruction ZeroVector 2020-04-22 20:46:15 +01:00
MerryMage
2721bb5ace emit_x64_floating_point: Add maybe_unused to preprocess parameter 2020-04-22 20:46:15 +01:00
MerryMage
0575e7421b A64: Implement FMINNM (scalar) 2020-04-22 20:46:15 +01:00
MerryMage
1c9804ea07 A64: Implement FMAXNM (scalar) 2020-04-22 20:46:15 +01:00
MerryMage
1dfce0894d constant_pool: Add frame parameter 2020-04-22 20:46:14 +01:00
MerryMage
bd2b415850 A64: Implement ADDP (scalar) 2020-04-22 20:46:14 +01:00
MerryMage
84f1c9b7f4 reg_alloc: Only exchange GPRs 2020-04-22 20:46:14 +01:00
MerryMage
9df3793af0 A64: Implement DUP (element), scalar variant 2020-04-22 20:46:14 +01:00
MerryMage
6541ec064d emit_x64_floating_point: Correct FP{Max,Min}{32,64} implementations for -0/+0 2020-04-22 20:46:14 +01:00
MerryMage
2080a51f41 A64: Implement FMAX (scalar), FMIN (scalar) 2020-04-22 20:46:14 +01:00
MerryMage
7c193485e1 a64/config: Allow NaN emulation accuracy to be set 2020-04-22 20:46:14 +01:00
MerryMage
a3df46a75a a64_emit_x64: Add conf to A64EmitContext 2020-04-22 20:46:14 +01:00
MerryMage
0e157b0198 A64: Implement FSQRT (scalar) 2020-04-22 20:46:14 +01:00
MerryMage
07520f32c3 backend_x64: Accurately handle NaNs 2020-04-22 20:46:14 +01:00
MerryMage
e97581d063 fuzz_with_unicorn: Print AArch64 disassembly 2020-04-22 20:46:14 +01:00
MerryMage
01c1e9017e T32: Add initial decoder list 2020-04-22 20:46:14 +01:00
MerryMage
ccf7df057b simd_three_same: Add VectorZeroUpper to CMGE (vector) and CMHS (vector) 2020-04-22 20:46:14 +01:00
MerryMage
8cebb87d0d A64: Implement CMGT (zero), CMEQ (zero), CMLT (zero) 2020-04-22 20:46:14 +01:00
MerryMage
7f68d556ab decoder/a64: Rearrange SIMD two-register misc decoders 2020-04-22 20:46:14 +01:00
MerryMage
d5af052f06 A64: Implement CMGE (register) 2020-04-22 20:46:14 +01:00
MerryMage
9d85991906 A64: Implement CMHI, CMHS 2020-04-22 20:46:14 +01:00
MerryMage
e2b9b7c5b0 IR: Implement Vector{Less,Greater}{,Equal}{Signed,Unsigned} 2020-04-22 20:46:14 +01:00
MerryMage
0df6725f73 A64: Implement SMAX, SMIN, UMAX, UMIN 2020-04-22 20:46:14 +01:00
MerryMage
47c0ad0fc8 IR: Implement Vector{Max,Min}{Signed,Unsigned} 2020-04-22 20:46:14 +01:00
MerryMage
adb7f5f86f A64: Implement CMGT (register) 2020-04-22 20:46:14 +01:00
MerryMage
f4775910f5 IR: Implement VectorGreaterSigned 2020-04-22 20:46:14 +01:00
MerryMage
1f5b3bca43 Exclusive fixups
* Incorrect size of exclusive_address
* Disable tests on exclusive memory instructions for now
2020-04-22 20:46:14 +01:00
MerryMage
f3fa4a042f a64_emit_x64: EmitExclusiveWrite: Make MSVC happy (narrowing conversion warning) 2020-04-22 20:46:14 +01:00
MerryMage
8698f057d0 A64: Implement STXP, STLXP, LDXP, LDAXP 2020-04-22 20:46:14 +01:00
MerryMage
2a6619d59c A64: Implement CLREX 2020-04-22 20:46:14 +01:00
MerryMage
b7a2c1a7df A64: Implement STXRB, STXRH, STXR, STLXRB, STLXRH, STLXR, LDXRB, LDXRH, LDXR, LDAXRB, LDAXRH, LDAXR 2020-04-22 20:46:14 +01:00
MerryMage
a6cc667509 Direct Page Table Access: Handle address spaces less than the full 64-bit in size 2020-04-22 20:46:14 +01:00
MerryMage
f45a5e17c6 Implement direct page table access 2020-04-22 20:46:14 +01:00
MerryMage
bfd3e30c75 callbacks: Member functions should be const 2020-04-22 20:46:14 +01:00
MerryMage
9f2f08db8d a64_emit_x64: Implement {Read,Write}Memory128 in terms of a function call 2020-04-22 20:46:14 +01:00
MerryMage
6c4773e85b abi: Add RAX to ABI_ALL_CALLER_SAVE 2020-04-22 20:46:14 +01:00
MerryMage
8756487554 A64: Partially implement MRS 2020-04-22 20:46:14 +01:00
MerryMage
bfd65bedfe A64: Implement DSB, DMB 2020-04-22 20:46:14 +01:00
MerryMage
5edd623b9d Implement DC instructions 2020-04-22 20:46:14 +01:00
Lioncash
a9153218bd A64: Implement NOT (vector) 2020-04-22 20:46:14 +01:00
MerryMage
2cb0a699ba IR: Implement FPMax, FPMin 2020-04-22 20:46:14 +01:00
MerryMage
aed4fd3ec3 A64: Implement FADD (vector), vector variant 2020-04-22 20:46:14 +01:00
MerryMage
98c8e7d1af IR: Implement FPVectorAdd 2020-04-22 20:46:14 +01:00
MerryMage
5f77ab28ee A64: Implement SSHLL, SSHLL2 2020-04-22 20:46:14 +01:00
MerryMage
eae518a338 IR: Implement VectorSignExtend 2020-04-22 20:46:14 +01:00
MerryMage
3738043e58 A64: Implement DUP (element), vector variant 2020-04-22 20:46:14 +01:00
MerryMage
ce7628b6b5 load_store_multiple_structures: Improve IR codegen for selem == 1 case 2020-04-22 20:46:14 +01:00
MerryMage
f1cb5581c9 A64: Implement FSUB (vector) 2020-04-22 20:46:14 +01:00
MerryMage
b9cd345ddc IR: Implement FPVectorSub 2020-04-22 20:46:14 +01:00
MerryMage
851fc83445 emit_x64_vector: EmitOneArgumentFallback 2020-04-22 20:46:14 +01:00
MerryMage
f378d2ef1b Forward declare IR::Opcode and IR::Type where possible 2020-04-22 20:46:14 +01:00
MerryMage
6c9b4f0114 A64: Implement CNT 2020-04-22 20:46:14 +01:00
MerryMage
303088a51e IR: Implement VectorPopulationCount 2020-04-22 20:46:14 +01:00
MerryMage
1dd2b33b87 A64: Implement MLS (vector) 2020-04-22 20:46:14 +01:00